ABSTRACT 

Model-based methods for estimating the population 
mean in stratified clustered sampling are described. The importance 
of adjusting the weights is assessed by an approach considering the 
sampling variation of the adjusted weights and its (variance) 
components. The resulting estimators are more efficient than the 
jackknife estimators for a variety of datasets obtained from the 1990 
Mathematics Trial State Assessment of the National Assessment of 
Educational Progress (NAEP). The methods can be extended to two-stage 
clustering. A general method for estimation of more complex 
population summaries, such as regression coefficients, is outlined. 
There are no distributional assumptions in model-based methods, apart 
from the normality of the sample means. Model-based methods use only 
the final adjusted weights; the replicate weights can be disposed of, 
thus radically reducing the size of the dataset and simplifying data 
handling procedures. The principal advantage of the model-based 
methods is in efficiency and small bias of the estimators of standard 
errors for the population mean. Contrary to theoretical claims, the 
NAEP operationally implemented jackknife estimator of the sampling 
variance is not unbiased. Eleven tables and 7 figures are included. 
Foreword 



The Research and Development (R&D) series of reports has been initiated:

1) To share studies and research that are developmental in nature. The results of such studies may be revised as the work continues and additional data become available.

2) To share results that are, to some extent, on the "cutting edge" of methodological developments. Emerging analytical approaches and new computer software developments often permit new, and sometimes controversial, analysis to be done. By participating in "frontier research," we hope to contribute to the resolution of issues and improved analysis.

3) To participate in discussions of emerging issues of interest to educational researchers, statisticians, and the Federal statistical community in general. Such reports may document workshops and symposiums sponsored by NCES that address methodological and analytical issues or may share and discuss issues regarding NCES practice, procedures, and standards.

The common theme in all these goals is that these reports present results or discussion 
that do not reach definitive conclusions at this point in time, either because the data 
are tentative, the methodology is new and developing, or the topic is one on which 
there are divergent views. Therefore the techniques and inferences made from the data 
are tentative and are subject to revision. To facilitate the process of closure on the 
issues, we invite comment, criticism, and alternatives to what we have done. Such 
responses should be addressed to: 



Emerson Elliott 
Commissioner 

National Center for Education Statistics 
555 New Jersey Ave. NW 
Washington, D.C. 20208 
Abstract 

Model-based methods for analysis of surveys with stratified clustered design are discussed and applied to the 1990 NAEP Trial State Assessment. The principal advantage of the model-based methods is in statistical efficiency, and computational simplicity for regression analysis. Model-based methods dispense with the replicate weights which form a large part of the survey data.

Some key words: clustering, regression, stratification, variance components.

1 Introduction 



Just like other large scale surveys, those comprising the 1990 Math Trial State Assessment Program have a complex sampling design, several features of which invalidate statistical analyses based on routinely adopted assumptions. A large part of this report is concerned with efficient estimation of the mean of proficiency for a population of students within a state, and of the standard error of such estimators, that take account of the salient features of the survey design.

We briefly summarize these features of the sampling design and indicate our approach. The students in the sample are associated with (unequal) sampling weights; they are clustered within schools, and schools are assigned to groups which for the purposes of analysis are regarded as strata. Each stratum is represented in the sample by a small number of schools (two for most strata), and most selected schools are represented by 20-30 students. The sampling procedures at the school level (selection of schools) and within a selected school are conditionally independent given the selected schools.

The design weight is the reciprocal of the probability, intended by the sampling design, of selecting a given student into the survey. This probability is the product of the (intended) probability of selecting the school, and of the (intended) conditional probability of selecting the student given selection of his/her school. Nonresponse of schools and individual students is compensated by adjusting the weights. This adjustment is not precise in the sense that the adjusted weights are not reciprocally proportional to the conditional probabilities of inclusion in the survey, given that the population of interest contains nonrespondents.

The adjustment depends on the sample drawn, and it is therefore meaningful to regard the adjusted weights as random variables. For each subject we consider the (unknown) average adjustment over all the samples that could have been drawn, and the actual adjustment calculated based on the drawn sample and its pattern of non-response. The variation over the hypothetical samples of the normalized difference, or the log-ratio, of those two quantities is an informative summary of the weight adjustment.

The outcome variable is the proficiency score. This score is defined by reference to a model relating the student's ability and item characteristics to the probability of correct response; see Mislevy (198?) for details. The proficiency score is itself estimated from the students' responses to cognitive items. A set of five exchangeable estimates of the proficiency score, called the plausible values, are defined for each student. In addition to the general proficiency scale for mathematics, subscales are defined for five content areas within the domain of mathematics. The report focuses on the general proficiency scores, but the methods presented are also applicable to the subscores. The methods used to obtain the proficiency scores and their actual values are accepted without criticism.

The computational algorithms described in this report are implemented in the statistical package Splus (Becker, Chambers, and Wilks, 1988), and some of them are documented in the Appendix. The principal advantages of Splus over statistical software established in quantitative educational research (such as SPSS, GLIM, or SAS) are flexibility (in both interactive and batch modes), ease of development of complex programs (functions), high quality graphics, and integrity of the environment generated by the defined data, functions, and other objects.






2 The sampling design 



The original intention was to draw from the population of eighth-graders in each participating state or 
territory a sample of 105 schools and 30 students from each selected school that has more than 30 eighth-graders, and all the students from schools with fewer than 35 eighth-graders. In some states a small number 
of schools were included in the sample with certainty, and a larger number of students was drawn from each of these 'certainty' schools. For states (or territories) with fewer than 105 schools each school would be included in the sample. In states with an appreciable proportion of students in small schools ('small' meaning fewer than 20 eighth-graders), aggregate units (sets of schools) containing more than 20 students would be the units of sampling. Several factors intervened in this design, including non-cooperation of schools (school districts) and non-response of students, and incomplete and inaccurate information relevant to the sampling frame. Some states and territories (such as Delaware, Guam, and Virgin Islands) have fewer than 105 schools. Allowing for non-response and small schools, it was expected that the sample for each state would comprise at least 2000 students from at least schools.

2.1 Elements of the sampling design 

2.1.1 The sampling frame 

The sampling frame (the list of schools in the state) was constructed using several official sources, such as NCES Common Core of Data and Quality Education Data, Inc. The frame consisted of a list of all schools in the state that have eighth-grade students, the (estimated or exact) number of eighth-graders, or the exact number in a previous year, and the stratifying variables, defined for the school district or another administrative unit: urbanicity (city, suburban, and 'other'), median household income (grouped ordinal categories), and, where prevalent, minority enrollment (high enrollment of black and/or Hispanic students). See Koffler (1991) for details.

Schools with fewer than 20 eighth-grade students ('small' schools) were either attached to schools in their geographic proximity to form units with more than 20 students or aggregated into units with 20 or more students each.

2.1.2 Selecting the schools 

A natural ordering of the strata was defined, combining urbanicity and minority enrollment. The schools 
were sorted in a 'serpentine' order, from the lowest median income to the highest in the first stratum, from 
the highest median income to the lowest in the next stratum, and so on. 
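The serpentine ordering can be sketched in a few lines of Python. This is only an illustration of the idea described above, not the operational NAEP procedure; the dictionary keys `stratum` and `income`, the function name, and the example schools are hypothetical.

```python
from itertools import groupby


def serpentine_order(schools):
    """Sort schools by stratum, reversing the income direction in every other stratum.

    Each school is a dict with the hypothetical keys 'stratum' (rank of the
    urbanicity-by-minority category) and 'income' (median household income).
    """
    by_stratum = sorted(schools, key=lambda s: s["stratum"])
    ordered = []
    for position, (_, members) in enumerate(
        groupby(by_stratum, key=lambda s: s["stratum"])
    ):
        # First stratum: lowest to highest income; next: highest to lowest; and so on.
        descending = position % 2 == 1
        ordered.extend(sorted(members, key=lambda s: s["income"], reverse=descending))
    return ordered


# Made-up schools, for illustration only.
schools = [
    {"name": "A", "stratum": 1, "income": 30},
    {"name": "B", "stratum": 1, "income": 10},
    {"name": "C", "stratum": 1, "income": 20},
    {"name": "D", "stratum": 2, "income": 15},
    {"name": "E", "stratum": 2, "income": 40},
    {"name": "F", "stratum": 3, "income": 25},
    {"name": "G", "stratum": 3, "income": 5},
]
order = "".join(s["name"] for s in serpentine_order(schools))
```

Alternating the direction keeps neighbouring schools on the sorted list similar in income even across stratum boundaries, which is presumably why a systematic sample from such a list is spread evenly over the income range.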

In some of the states a small number of schools, $C$, were included in the sample with certainty. The rest of the schools are referred to as non-certainty schools. From the sorted list of non-certainty schools a systematic sample was drawn, with a random start, probability proportional to school enrollment, and step-length such as to ensure that $K$ schools would be selected. For most states $C + K = 105$. This systematic sampling scheme is best illustrated as follows: The non-certainty schools are represented on a straight line by segments of lengths proportional to eighth-grade enrollment. A step-length $s$ for a systematic sample from this line is set, and the start is chosen as the point in distance $t_1$ from the origin, where $t_1$ is a draw from the uniform distribution on $(0, s)$. Further sampled points are $t_1 + ks$, $k = 1, 2, \ldots, K - 1$, where $K$ is the desired number of points; $s$ is chosen so that $Ks$ is the length of the segment corresponding to all the schools. The schools which correspond to the segments containing the selected points are included in the survey. Figure 1 illustrates this sampling procedure. In the diagram each of the 39 'schools' is represented by a segment delineated by short ticks. The urbanicity-by-minority categories are delineated by longer ticks, and the points drawn by systematic sample of size 10 are indicated by long dotted ticks. The segment of the corresponding school is underlined. The asterisks under the segments indicate the median income categories for the schools.

Figure 1: Weighted systematic sampling of schools.
Notes: The schools are delineated by short ticks, the groups by long ticks. The sampled points are indicated by long dotted ticks (points $t_1$, $t_2$, ...). Median income level (three categories) is indicated by asterisk. The segments of the selected schools (containing the sampled points) are underlined.
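The systematic selection with probability proportional to enrollment can be sketched as follows. This is a minimal illustration under stated assumptions, not the operational implementation: the function name and arguments are hypothetical, the schools are assumed to be already sorted, and `rng` is any object with a `uniform(a, b)` method (the `random` module by default).

```python
import random
from bisect import bisect_right
from itertools import accumulate


def systematic_pps_sample(enrollments, n_select, rng=random):
    """Return the indices of the schools hit by a systematic sample with a random start.

    `enrollments` are the segment lengths (eighth-grade enrollments) of the sorted
    non-certainty schools; `n_select` is the desired number of points, K.
    """
    boundaries = list(accumulate(enrollments))  # right end of each school's segment
    total = boundaries[-1]
    step = total / n_select                     # s, chosen so that K * s is the total length
    start = rng.uniform(0, step)                # random start drawn from (0, s)
    points = [start + k * step for k in range(n_select)]
    # A point falls in the first segment whose right boundary exceeds it; the min()
    # only guards against floating-point rounding at the very end of the line.
    return [min(bisect_right(boundaries, p), len(boundaries) - 1) for p in points]
```

A school whose enrollment exceeds the step-length could be hit by more than one point in this sketch; treating very large schools separately, as the certainty schools are, avoids that situation.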

Arrangements for substitution of the non-cooperating schools in the sample are described in Koffler (1991).

2.1.3 Replicate groups stratification 

The design of the survey for a state has a number of features that cannot be explicitly modelled. The 'reference' model that is considered by the NAEP analysis staff as well as by other researchers is that of a stratified weighted clustered sampling. The 'strata' in this context are defined after selection of the schools. To avoid confusion with the strata defined by crossclassification of urbanicity, median income, and minority composition we refer to them as replicate groups. It was decided that each state would have 56 replicate groups (with a few exceptions). The procedure of forming replicate groups is described in Koffler (1991) and justified in Johnson and Rust (1992). Most of these groups comprise a pair of clusters, others have either three or just one cluster. For operational convenience, empty replicate groups (containing no clusters) are declared so that the total number of groups is 56. The main purpose of this is to have a uniform format for the user tapes for all the states and territories.

Table 1: Clustering structure of the New Jersey sample

Groups    Numbers of selected students within clusters

1      ■29+20   21 + 28   22 + 26   2:i+l!)   2.1 +2.1   :{()+2:(   1!l+27
1 i    ■>:)+2'.i   21+2 t )   28   21+21   29 + 28   26+29   21
1")    21   20 + 26   2!)f27   2ti   20+20   20+26   21+22
22     22   28   28+28   2<l+26   22 + 2-")   26+21   26 + 2.1+21
11+1 1   21+26   2'.i   :«   2,h+21   22+2(5   26+28
22 + 26   22+21   27+2.1   22+11   ■ 11   ■12   2,1+2:1
21+ in   22+2.",   21+2!)   28+27   21+26   28+26   I!)
2!)+2!>   29+26   20+26   2.")+2.'i   28+26   21+21   54+27
of)    11   27 + 2!l   28+2!)   27+27   2.1+27+2")

Notes: Each entry of the table contains the numbers of selected students within the clusters in each group 1-53. For example, group 1 has a cluster with 29 and one with 20 students in the sample. Groups M and 5.S have three clusters each in the sample. The count for the certainty school is printed in boldface.

2.1.4 Selecting students 

The school districts were requested to compile lists of all the eighth-grade students in the selected schools. From each selected school with enrollment of more than 11 eighth-graders a random sample of 30 students was drawn without replacement. From schools with fewer than 16 students all students were included in the sample. In order to ensure that each student had approximately the same chance of being included in the sample, schools with fewer than 20 students were 'thinned' (preselected); they were excluded from the sampling frame with probability inversely proportional to the total number of students in small schools.

For example, the New Jersey sample is described by the clustering structure of selected students within schools, and of the selected schools within groups, given in Table 1. The sample contains one certainty school (in stratum 49) with 60 selected students of whom six did not cooperate with the survey. A few small schools were aggregated into clusters containing at least 20 students each. Most clusters have between 20 and 30 students in the sample. The sample comprises 2710 students from 101 clusters (consolidated schools) in 53 groups; two groups are represented by three clusters each and four groups by one cluster each. In the process of selecting the sample of schools three small schools were 'thinned' out. There are three empty groups (54 through 56).

Figure 2: Adjusted weights for New Jersey.

Notes: The horizontal axis is the cluster index (1-101) and the vertical axis is the adjusted sampling weight.

Duplicate values of the weight in a cluster are represented by a single dot.

2.1.5 Sampling weights 

Each school (or cluster) is associated with a school-level design (base) weight and each student with a student-level (within-school) design weight. The sampling of students is conditionally independent of the sampling of schools, given the selected set of schools, and so each student's design sampling weight is equal to the product of these two weights. After the sampling procedure and administration of the survey questionnaire the students' weights are adjusted for non-response. Figure 2 contains a plot of these adjusted weights for the New Jersey sample. Note that the weights have little variation within clusters; in fact, most clusters have only two distinct and in most cases not very different values of the weight. There are three clusters with weights about three times larger than the rest of the clusters.
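As a small illustration of how a student's design weight is formed, the sketch below multiplies the school-level and within-school weights. The function names and the selection probabilities are hypothetical, and the non-response adjustment shown (inflation by the inverse response rate within a school) is only an assumed, simplified scheme, not the operational NAEP adjustment.

```python
def design_weight(p_school, p_student_given_school):
    """Product of the school-level and within-school design weights."""
    school_weight = 1.0 / p_school                  # school-level (base) weight
    student_weight = 1.0 / p_student_given_school   # student-level (within-school) weight
    return school_weight * student_weight


def adjust_for_nonresponse(weight, n_selected, n_responding):
    """Assumed simple adjustment: inflate the weight by the inverse response rate."""
    return weight * n_selected / n_responding


# Hypothetical probabilities: the school is selected with probability 0.1 and
# 30 of its 300 eighth-graders are sampled.
base = design_weight(0.1, 30 / 300)
# In the certainty school of the New Jersey sample 60 students were selected
# and six did not cooperate, so 54 responded.
adjusted = adjust_for_nonresponse(base, 60, 54)
```

Because the adjustment factor is computed from the realized pattern of non-response, the adjusted weight changes from one hypothetical sample to another, which is the sense in which the adjusted weights are treated as random variables in this report.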

2.2 Proficiency scores 

The student questionnaire contains items of four kinds: 

• socio-demographic and background

• attitudinal

• experiential

• cognitive.

The background items probe the family environment and the educational level of parents. Attitudinal items relate to student's familiarity with calculators and computers, to perception of usefulness of mathematics, and the like. Questions about mathematics classes taken are an example of experiential items. Cognitive items are mostly multiple choice items, and are scored as correct or incorrect. Based on these scores a number of (sub-)scales are defined (Measurement, Data analysis and statistics, Geometry, and the like). Our discussion is restricted to the composite scale, based on all the cognitive items.

For a given scale a proficiency score is defined for each student. It is estimated from the item-level 
scores by an item-response method (see Lord, 1980, and Mislevy, 1985, for background). The estimation 
is strengthened by incorporating (conditioning on) information contained in the student- and school-level 
background variables. The proficiency scores are subject to uncertainty, and they are represented for each 
student by a set of five plausible values. 

The teacher questionnaire contains items about the teacher's qualifications, teaching methods, and 
about emphasis on elements of the curriculum. 

Figure 3 presents a compact graphical summary of the proficiency scores and final weights. The proficiency scores are represented by the first set of plausible values. The plots on the left-hand side summarize the distribution of proficiency; at the top its values are plotted, and the within-cluster means are joined by a solid line; at the bottom the within-cluster standard deviations are plotted. The right-hand plots summarize the association of sampling weights and proficiency. At the top the two sets of quantities and at the bottom their within-cluster means are plotted.

2.3 Notation 

In general, we use capitals to denote quantities that refer to the population, and lowercase characters to denote sample quantities. For example, $N$ stands for the population size (number of students in the population), and $n$ for the sample size (number of students in the sample). The proficiency score for student $i$ (in the population) is denoted by $Y_i$. For students in the sample we use three indices, $ijk$, for student $i = 1, \ldots, n_{jk}$ in cluster $j = 1, \ldots, m_k$ in group $k = 1, \ldots, K$, and by $y_{ijk}$ we denote the proficiency score of student $ijk$. Implicitly, we have introduced $n_{jk}$ as the number of sampled students in the cluster $jk$ and $m_k$ as the number of sampled clusters in group $k$. The population counterparts for $n_{jk}$ and $m_k$ are denoted by $N_{jk}$ (number of students in cluster $jk$, $j = 1, \ldots, M_k$) and $M_k$, respectively. The proficiency scores and the adjusted (final) sampling weights for the sampled students are denoted by $y_{ijk}$ and $w_{ijk}$, respectively. For the plausible values we use another index, $h = 1, \ldots, 5$, so that $y_{ijkh}$ is the plausible value $h$ for student $ijk$.

Model parameters are also denoted by lowercase, such as $\mu$, and their estimators are denoted by the same characters with hats, such as $\hat\mu$. When there are several estimators for a single parameter they are distinguished by an (additional) index. In notation we do not distinguish between an estimator (a function of the data, considered as a random variable) and an estimate (the realized value of the estimator for the drawn sample). The sampling variance of an estimator, say $\hat\mu$, is denoted by $\mathrm{var}(\hat\mu)$, and the estimate and estimator of this variance is denoted by $\widehat{\mathrm{var}}(\hat\mu)$.

For random variables we use lowercase Greek characters $\alpha$, $\delta$, $\varepsilon$, and for parameters $\mu$ (mean), $\sigma^2$ (variance), $\rho$ (correlation), $\tau$ (variance ratio), and the like. The expectation of a random variable, say $\gamma$, is denoted by $E(\gamma)$. When it is necessary to distinguish between taking expectation over the samples and over the students this is explicitly stated.

Vectors and matrices are denoted by bold characters, Latin for constants, Greek for random vectors and matrices.

Figure 3: Final weights and proficiency scores for the New Jersey sample.
Notes: The plots are: A. Proficiency by cluster, with within-cluster mean proficiencies joined by solid line, B. Proficiency by final weight (students), C. Within-cluster standard deviations of proficiency, and D. Within-cluster mean proficiency by mean weight.
3 Model-based estimation of standard errors in stratified clustered sampling 

In this section the model-based method of Potthoff, Woodbury, and Manton (1992) is implemented for estimation of the population and subpopulation means in the stratified clustered sampling design used in the NAEP State Assessment Program. The method relies on a superpopulation approach and has several features of the standard analysis of variance.

In the previous section we identified the following features of the sampling design: 

1. sampling weights;

2. clustering (students within schools);

3. stratification (replicate groups);

4. non-response;

5. indirect measurement (estimation) of the outcome.

We consider first models and estimation procedures that accommodate each of these features on their own, and then construct a model that combines all of these features.

3.1 Modelling features of the design 
3.1.1 Sampling weights 

Let $Y_i$, $i = 1, 2, \ldots, N$, be the proficiency scores for the population of $N$ students (say, in a state). The population mean is defined as

$$\mu = \frac{1}{N} \sum_i Y_i$$

(the summation is over the entire population). When proficiency scores are available only for the sampled students, $i = 1, 2, \ldots, n$, the population mean is commonly estimated by the weighted mean

$$\hat\mu = \frac{\sum_i w_i y_i}{\sum_i w_i} \qquad (1)$$

(the summations are over the sampled students), where $w_i$ is the sampling weight associated with student $i$. When the weights $w_i$ are constant, equation (1) coincides with the unweighted mean,

$$\hat\mu = \frac{1}{n} \sum_i y_i .$$

In this case the common (non-zero) value of the weights is irrelevant for the estimator in (1). More generally, applying a (positive) constant multiplicative factor on the weights (that is, changing $w_i$ to $C w_i$ for some $C > 0$) does not affect the estimator of the mean (1).
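The two properties of the weighted mean noted above (invariance to a constant factor on the weights, and agreement with the unweighted mean for constant weights) are easy to check numerically. The following is a minimal sketch; the function name and the data are made up for illustration.

```python
def weighted_mean(y, w):
    """Weighted mean of the outcomes y with sampling weights w, as in equation (1)."""
    return sum(wi * yi for wi, yi in zip(w, y)) / sum(w)


# Made-up proficiency scores and sampling weights.
y = [250.0, 262.0, 271.0, 240.0]
w = [40.0, 40.0, 55.0, 120.0]

original = weighted_mean(y, w)
rescaled = weighted_mean(y, [3.7 * wi for wi in w])   # weights multiplied by C = 3.7
unweighted = weighted_mean(y, [1.0] * len(y))         # constant weights
```

Here `original` and `rescaled` agree up to rounding error, and `unweighted` is the ordinary arithmetic mean of `y`.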

For equal weights the sampling variance of the estimator (1) is estimated by

$$\widehat{\mathrm{var}}(\hat\mu) = \frac{1}{n} \, \frac{\sum_i (y_i - \hat\mu)^2}{n - 1} .$$

A natural extension for unequal weights is the equation

$$\widehat{\mathrm{var}}(\hat\mu) = \frac{1}{\sum_i w_i} \, \frac{\sum_i w_i (y_i - \hat\mu)^2}{\sum_i w_i - 1} . \qquad (2)$$

Equation (2) is not invariant with respect to constant multiplicative factors of weights, and so a 'reasonable' choice of normalization for the weights $w_i$ is essential. In one normalization the sample mean of the weights is equal to unity, that is, $\sum_i C w_i = n$, or $C = n / \sum_i w_i$. Another choice is defined by the requirement that the total of the weights be equal to the total of the squares of the weights:

$$\sum_i C w_i = \sum_i C^2 w_i^2 ,$$

that is, $C = \sum_i w_i / \sum_i w_i^2$. For a probability random sample the total of the weights normalized in this manner, $n_A = (\sum_i w_i)^2 / \sum_i w_i^2$, is referred to as the effective sample size (Potthoff et al., 1992). For this normalization of weights the estimator in (2) is unbiased.

The sample size $n$ is greater or equal to the effective sample size $n_A$. Their ratio, $n / n_A$, is referred to as the design effect due to weights. It is equal to unity when the weights are constant.
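The normalization and the effective sample size can be sketched as below. The function names are hypothetical, and the variance function assumes that equation (2) has the form $\sum_i w_i (y_i - \hat\mu)^2 / \{\sum_i w_i (\sum_i w_i - 1)\}$, which reduces to the usual estimator for equal weights; that form is an assumption of this sketch rather than a quotation of the original display.

```python
def normalize_weights(w):
    """Rescale the weights so that their total equals the total of their squares."""
    c = sum(w) / sum(wi * wi for wi in w)
    return [c * wi for wi in w]


def effective_sample_size(w):
    """(sum of weights)^2 / (sum of squared weights); invariant to rescaling."""
    return sum(w) ** 2 / sum(wi * wi for wi in w)


def variance_of_weighted_mean(y, w):
    """Assumed form of equation (2), applied to the normalized weights.

    Requires an effective sample size greater than one.
    """
    u = normalize_weights(w)
    total = sum(u)  # equals the effective sample size
    mu = sum(ui * yi for ui, yi in zip(u, y)) / total
    ss = sum(ui * (yi - mu) ** 2 for ui, yi in zip(u, y))
    return ss / (total * (total - 1.0))


def design_effect_due_to_weights(w):
    """Ratio of the sample size to the effective sample size."""
    return len(w) / effective_sample_size(w)
```

For constant weights `effective_sample_size` returns the sample size itself, so the design effect due to weights is one, in agreement with the text.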

3.1.2 Clustering 

Clustering is usually represented in statistical models by a non-negative correlation among the observations within a cluster, or by cluster-specific 'effects', deviations of within-cluster means (or other summaries) from the corresponding population summary. These two model approaches are essentially identical, and are often used interchangeably. In the latter approach we have a decomposition of the variance of the outcomes into its within- and between-cluster components. These we denote by $\sigma^2_W$ and $\sigma^2_B$, respectively. The variance ratio $\tau$ is defined as $\tau = \sigma^2_B / \sigma^2_W$. The within-cluster correlation is equal to $\rho = \sigma^2_B / (\sigma^2_W + \sigma^2_B) = \tau / (1 + \tau)$.

It is intuitively appealing to assume that the outcomes of students within a classroom are positively correlated because students in a classroom share the same educational processes and experiences. The size of this correlation exerts an influence on the estimators of the population mean. This can best be illustrated by considering two extreme cases. If the outcomes for students within a school are perfectly correlated, that is, they are equal, then the outcomes for the students from a school are perfectly summarized by the data for any one of the students. If the sample comprises $n$ students from $m$ schools ($m < n$) there are only $m$ 'essential' observations, not $n$. On the other hand, if the within-school correlation vanishes there are $n$ essential observations. In intermediate cases (small positive correlation), it is reasonable to expect that the sample at hand is as informative (its sample mean has as small a sampling variance) as a random sample of somewhat smaller size. The ratio of these sample sizes is referred to as the design effect due to clustering.

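The sampling variance of the arithmetic mean $\hat\mu = \sum y / n$ is equal to

$$\mathrm{var}(\hat\mu) = \frac{\sigma^2_W}{n} \left( 1 + \frac{\rho}{1 - \rho} \, \frac{\sum_j n_j^2}{n} \right) ,$$

where $n_j$ is the number of sampled students in cluster $j$; it is an increasing function of the correlation $\rho$, and of the variance $\sigma^2_W$. Note that there may be more efficient estimators of the population mean when the clusters have a wide range of sample sizes. In an alternative estimator the influence of an outcome (its weight) from a large cluster would be smaller than from a small cluster. Clearly, for an efficient estimator these 'weights' have to depend on the within-cluster correlation $\rho$.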
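The effect of clustering on the arithmetic mean can be illustrated numerically. The sketch below assumes the variance component model of this section (independent cluster effects with variance $\sigma^2_B$ and independent within-cluster deviations with variance $\sigma^2_W$), under which the variance of the mean is $(\sigma^2_W / n)(1 + \tau \sum_j n_j^2 / n)$ with $\tau = \rho / (1 - \rho)$. The function names are hypothetical, and the design effect is taken relative to a simple random sample of the same size with the same total variance.

```python
def variance_of_mean(cluster_sizes, sigma2_w, rho):
    """Variance of the arithmetic mean under the assumed variance component model.

    Requires rho < 1, because the variance ratio tau = rho / (1 - rho) is used.
    """
    n = sum(cluster_sizes)
    tau = rho / (1.0 - rho)
    return sigma2_w / n * (1.0 + tau * sum(nj * nj for nj in cluster_sizes) / n)


def design_effect(cluster_sizes, rho):
    """Variance under clustering divided by the variance of a simple random sample."""
    n = sum(cluster_sizes)
    return 1.0 + rho * (sum(nj * nj for nj in cluster_sizes) / n - 1.0)


# Four clusters of 25 students each.
no_correlation = design_effect([25] * 4, 0.0)       # n essential observations
modest_correlation = design_effect([25] * 4, 0.2)
perfect_correlation = design_effect([25] * 4, 1.0)  # only m = 4 essential observations
```

With clusters of equal size $b$ the design effect reduces to $1 + \rho (b - 1)$. For perfect correlation it equals $n / m$, which matches the remark that only $m$ 'essential' observations remain.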

Clusters may have unequal within-cluster variances. Then equal within-cluster covariances across the clusters do not correspond to equal variance ratios.

3.1.3 Stratification 

Stratification is an important device for reduction of sampling variation of the estimates from sample surveys. It can be interpreted as a partitioning of the target population into an exhaustive set of non-overlapping subpopulations called strata, and carrying out a separate sample survey for each stratum. The population mean and other quantities of interest can be estimated by combining the corresponding estimates for the strata. The target population may exhibit substantial variation, but variation within each stratum may be much smaller. The parameters referring to a stratum can then be estimated with high precision, even if such estimates are based only on a fraction of the sample.

Clearly, a key to successful stratification is in identifying a small number of strata with distinct stratum means or, more generally, attributes and characteristics strongly associated with the variables of interest.

We adopt the approach of the NAEP operational analysis and regard the 56 replicate groups as the strata. In standard survey practice strata are defined for the target population (or the sampling frame) prior to sampling. To avoid confusion with the stratification of schools in NAEP, defined by median income, minority composition, and urbanicity, we use the term group for each of the 56 replicate groups.

3.1.4 Adjustment for non-response and poststratification 

Having been selected into the survey, individual students or entire schools may refuse to cooperate. If it is feasible within the practical constraints, a non-cooperating school is replaced by a 'substitute' which matches the selected school as closely as possible on several attributes (for instance, on the stratifying variables and the enrollment). Characteristics of the non-cooperating students are not known, and therefore a scheme for their replacement by cooperating students is not feasible. Instead, the sampling weights are adjusted to take account of the 'missing' observations. In decisions about (approximate) sample size due account is taken for the expected proportion of non-cooperating students, as well as for differences between estimated and actual within-school enrollments and for a number of contingencies.

An important consequence of non-cooperation (nonresponse) is that the weights, proportional to the reciprocals of the probabilities of being selected into the sample, are not proportional to the reciprocals of the probabilities of inclusion in the sample. When nonresponse is informative, using these design weights would result in biased estimation. The design weights are therefore adjusted for nonresponse; for details of weight adjustment in NAEP Trial State Assessment, see Koffler (1991). Since the adjustment depends on the sample drawn, the weights are random variables (a different sample may result in a different weight adjustment even for a student included in both samples). Of course, a student responding in one survey might not respond in a hypothetical replicate of this survey; however, we have no information about the consistency of the pattern of nonresponse.

As a naive model for weight adjustment we consider an underlying mean weight for each student, averaged over all possible samples (or all those in which the student would be included in the sample). Since weights are invariant with respect to constant multiples and are proportional to reciprocals of probabilities it is advantageous to use the logarithm scale. For each subject $i$ we posit the model

$$\log(w_{i,s}) = \log(w_i) + \gamma_{i,s} , \qquad (4)$$

where $w_{i,s}$ is the sampling weight assigned to student $i$ when sample $s$ is drawn, $w_i$ is the geometric mean of the weights for student $i$, taken over all the samples $s$, and $\{\gamma_{i,s}\}_s$ is a random sample from a centered distribution. The variance $\mathrm{var}(\gamma_{i,s})$, taken over the hypothetical samples $s$, is a measure of how influential the nonresponse is. The clustered nature of the weight adjustment can be incorporated by a variance component model:

$$\log(w_{ijk,s}) = \log(w_{ijk}) + \gamma^{(2)}_{jk,s} + \gamma^{(1)}_{ijk,s} , \qquad (5)$$

where $\{\gamma^{(2)}_{jk,s}\}$ and $\{\gamma^{(1)}_{ijk,s}\}$ are two mutually independent random samples and $w_{ijk}$ is the geometric mean of the weights for student $ijk$.
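The variance component model for the log-weights can be explored by simulation. The sketch below is only an illustration under assumptions that go beyond the text: the two sets of deviations are drawn from normal distributions (the model itself requires only centered distributions), and the underlying mean weights, standard deviations, and function name are made up.

```python
import math
import random
import statistics


def draw_adjusted_weights(mean_weights, sd_cluster, sd_student, rng):
    """Draw the adjusted weights for one hypothetical sample s.

    `mean_weights` is a list of clusters, each a list of the underlying
    (geometric mean) weights of its students.
    """
    drawn = []
    for cluster in mean_weights:
        cluster_dev = rng.gauss(0.0, sd_cluster)  # deviation shared by the whole cluster
        drawn.append(
            [w * math.exp(cluster_dev + rng.gauss(0.0, sd_student)) for w in cluster]
        )
    return drawn


rng = random.Random(1)
mean_weights = [[40.0, 40.0, 42.0], [55.0, 55.0]]  # made-up underlying weights

# Log-ratio of the drawn weight to the underlying weight for one student,
# collected over many hypothetical samples.
log_ratios = []
for _ in range(5000):
    sample = draw_adjusted_weights(mean_weights, 0.2, 0.1, rng)
    log_ratios.append(math.log(sample[0][0] / mean_weights[0][0]))

# Close to the sum of the two variance components, 0.2**2 + 0.1**2 = 0.05.
total_variance = statistics.pvariance(log_ratios)
```

In practice only one sample is drawn, so the variance of the log-ratio cannot be computed in this direct way; the simulation merely shows what the variance in the model refers to.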

In surveys that are carried out on well-researched target populations it is advantageous to adjust the design (sampling) weights so as to bring them into accord with information about the target population external to the survey (various 'official' sources, censuses, and the like). This is referred to as poststratification. Poststratification is not applied in NAEP Trial State Assessment.

3.1.5 Estimation of proficiency 

The proficiency scale is defined in relation to the cognitive items. The proficiency of a student is estimated using item-response models; see Koffler (1991) for details, and Mislevy and Bock (1982) and Lord (1980) for background. In order to adequately represent the variation of the estimators of proficiency for each student the proficiency is represented by a set of five draws from the estimated posterior distribution of the proficiency. Specifically, the item-response method used (Mislevy and Bock, 1982) yields an estimated distribution of the underlying parameters, from which five random draws are obtained and a set of five plausible values is calculated based on the drawn values. Estimation of proficiency scores is improved by conditioning on several background variables. For details, see Johnson and Allen (1992, Chapter 11) and Mislevy and Sheehan (19??).

In general, estimation of any parameter is carried out for each plausible value (five analyses), and the
mean of these estimates is adopted. Formally, let $y = \{y_{ih}\}_{i,h}$ be the $n \times 5$ matrix of plausible values for
the entire sample, and $\hat\theta_h$, $h = 1, \ldots, 5$, be the estimator based on the $h$th set of plausible values. Then
the mean $\hat\theta = \frac{1}{5}\sum_h \hat\theta_h$ is the adopted estimator.

To emphasize dependence on data we write $\hat\theta_h = \hat\theta(y_h)$, where $y_h$ denotes column $h$ of $y$. Note that
for estimators linear in the data, such as the weighted mean, $\hat\theta$ is equal to the same estimator using the
within-subject means of the plausible values, that is, $\hat\theta = \hat\theta(\bar y)$, where $\bar y$ is the vector of row-wise means of
$y$.

For estimation of the sampling variance the estimator of the mean of the sampling variances from the
five analyses is supplemented by the variance of the estimates:

$$\mathrm{var}(\hat\theta) = \mathrm{E}_h\{\mathrm{var}_s(\hat\theta_h)\} + \mathrm{var}_h(\hat\theta_h) \tag{6}$$

(the subscripts $s$ and $h$ indicate averaging over samples, and over the five sets of plausible values,
respectively).
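The combination of the five analyses can be sketched as follows. This is a minimal illustration, not the operational NAEP code: the function name `combine_plausible_values` is hypothetical, and the variance of the estimates is computed with divisor M - 1, which is an assumption (the text does not specify the divisor).

```python
from statistics import mean, variance


def combine_plausible_values(estimates, sampling_variances):
    """Combine per-plausible-value analyses into an overall estimate and variance.

    estimates: one estimate for each set of plausible values.
    sampling_variances: the estimated sampling variance of each of those estimates.
    """
    if len(estimates) != len(sampling_variances) or len(estimates) < 2:
        raise ValueError("need matching lists with at least two entries")
    overall = mean(estimates)
    within = mean(sampling_variances)  # mean of the sampling variances
    between = variance(estimates)      # assumption: sample variance, divisor M - 1
    return overall, within + between
```

For example, estimates 1, 2 and 3, each with sampling variance 0.5, give the overall estimate 2 with variance 0.5 + 1 = 1.5.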

It is instructive to consider a plausible value $y_{ih}$ as a sum of the overall (subpopulation) mean, $\mu$,
deviation of the student's proficiency $y_i$ from the mean $\mu$, $\delta_i = y_i - \mu$, and deviation of the plausible
value from the proficiency, $\varepsilon_{ih} = y_{ih} - y_i$. Assuming that these two sets of deviations, $\{\delta_i\}$ and $\{\varepsilon_{ih}\}$, are
independent, a desirable property of any procedure for generating plausible values, the proficiencies $\{y_i\}_i$
have smaller variation than the plausible values $\{y_{ih}\}_i$, for any $h$. The difference is the variance of the
plausible values around the proficiency score:

$$\mathrm{var}(y_{ih}) = \mathrm{var}(\delta_i) + \mathrm{var}(\varepsilon_{ih}).$$

Note that for the within-student means of plausible values $\bar y_i$ we have

$$\mathrm{var}(\bar y_i) = \mathrm{var}(\delta_i) + \frac{1}{5^2}\sum_h \mathrm{var}(\varepsilon_{ih}),$$

which equals $\mathrm{var}(\delta_i) + \mathrm{var}(\varepsilon_{ih})/5$ when the five plausible values are uncorrelated around the proficiency
and have a common variance.

These means exhibit more variation than the proficiency scores; the variance of the latter is $\mathrm{var}(\delta_i)$. It is
therefore not appropriate to carry out a single analysis using the student-wise means.
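The excess variation of the student-wise means can be checked by a small simulation. The sketch below assumes normally distributed deviations and independent draws around each proficiency; the function name and the variance values used are hypothetical and chosen only for illustration.

```python
import random
from statistics import mean, pvariance


def simulate_mean_of_plausible_values(n_students, var_delta, var_eps, n_draws=5, seed=0):
    """Return (variance of proficiencies, variance of means of plausible values).

    Assumption: proficiency deviations and plausible-value deviations are
    independent normal variates with variances var_delta and var_eps.
    """
    rng = random.Random(seed)
    sd_delta, sd_eps = var_delta ** 0.5, var_eps ** 0.5
    proficiencies, pv_means = [], []
    for _ in range(n_students):
        delta = rng.gauss(0.0, sd_delta)
        draws = [delta + rng.gauss(0.0, sd_eps) for _ in range(n_draws)]
        proficiencies.append(delta)
        pv_means.append(mean(draws))
    return pvariance(proficiencies), pvariance(pv_means)
```

With `var_delta = 1.0` and `var_eps = 0.5` the variance of the means should be close to 1.0 + 0.5/5 = 1.1, that is, larger than the variance of the proficiencies.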

3.2 Randomness and conditioning in inference 

In surveys, as in statistical practice in general, we are interested in sampling distributions of estimators. An
orthodox view of the sampling distribution of an estimator in a survey is to consider the distribution of the
estimates of a parameter in a large (infinite) number of (hypothetical) replications of the survey. The goal
in a typical estimation problem is to make inference about such a distribution, based on a single realization
of the survey. Clearly, features of the survey have to be utilized to compensate for lack of replication.
For a survey with a complex sampling design, non-response, and imperfect reliability of the response (for
instance, due to measurement or estimation error), a hypothetical replication will have different students
and schools in its sample, it may have a different sample size (different numbers of students and schools),
but even the response/outcome of the student who happens to be selected in both surveys will be different
(students', or indeed, our responses to even the most ubiquitous survey items are known not to be perfectly
reliable).

Unconditional inference, averaging over a large number of hypothetical surveys, is often a tall order,
and in practice inference is conditioned on the selected sample. It is meaningful, we believe, to consider
conditioning on the sample that would have been obtained had each selected school and individual fully
cooperated. Such a conditional inference is difficult to conceptualize because selection of the students is
conditional on selection of their schools, (some) schools that refused to cooperate were substituted by other
schools in the survey, and so on. Moreover, a school that failed to cooperate in the realized survey might
cooperate in a hypothetical replication of the survey.

3.3 Jackknife 

This section describes the jackknife method used for estimation of population and subpopulation means
and their standard errors. The jackknife is a general method for reduction of bias of estimators and for
estimation of their sampling variances. We describe the jackknife method as applied to NAEP State Trial
Assessment.

The mean proficiency for the state is estimated by the weighted mean

$$\hat\mu = \frac{\sum_{ijk} w_{ijk}\, y_{ijk}}{\sum_{ijk} w_{ijk}}. \tag{7}$$

Computation of the sampling variance of this estimator presents problems arising from complexity of
the sampling design: unequal probabilities of selection, clustered sampling design and adjustment for
nonresponse.

Each of the K = 56 replicate groups (whether empty or not) is associated with a pseudoanalysis carried
out on a pseudosample. In groups with more than one cluster the clusters are assigned order (first, second,
etc.) at random. If group k contains two clusters then the pseudosample for pseudoanalysis k is created by
replacing the first cluster in group k by the other cluster in the group. This is equivalent to doubling the
weights for all students in the second cluster. For groups with three clusters the first cluster is removed,
and the weights for the students in the other two clusters are multiplied by 1.5. Thus, each student in the
sample is associated with K + 1 = 57 replicate weights. These weights are given in the NAEP dataset. If
poststratification were applied the replicate weights would have to be adjusted by poststratification of the
pseudosample. When carried out operationally, this represents a substantial computational load.

The $k$th pseudoanalysis evaluates the estimator (7) using the $k$th set of replicate weights; we denote
this estimator by $\hat\mu^{(k)}$. The jackknife estimator of the mean $\mu$ is defined as the arithmetic mean of the
pseudoestimators,

$$\hat\mu_J = \frac{1}{K}\sum_k \hat\mu^{(k)}. \tag{8}$$

The sampling variance of $\hat\mu_J$ is estimated as the sum of squares of deviations of the pseudoestimators $\hat\mu^{(k)}$
from the jackknife estimator $\hat\mu_J$:

$$\widehat{\mathrm{var}}(\hat\mu_J) = \sum_k \left(\hat\mu^{(k)} - \hat\mu_J\right)^2; \tag{9}$$

see Wolter (1985) for details. Note that for estimation of the population mean only groups with two or
more clusters contribute to the sum of squares in (9). However, for subpopulation means the same ordering
of clusters is used, and if the subpopulation is represented only by one non-empty cluster in the dataset
which happens not to be the first cluster, the group does make a contribution to the sum of squares in (9).
In practice the estimator (7) is used instead of (8), and (9) is used as the estimator of sampling variance
of (7). In brief, the jackknife is used only for sampling variance estimation.
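The construction of the pseudoestimators and of the jackknife variance estimate can be sketched as follows. This is an illustrative sketch, not the NAEP implementation: the record layout and function names are hypothetical, the 'first' cluster of a group is taken to be the one with the smallest label (the caller is assumed to have assigned the labels at random), and only non-empty groups are processed. The factor applied to the remaining clusters, c/(c - 1) for a group with c clusters, reproduces the doubling for pairs and the multiplication by 1.5 for triplets.

```python
from collections import defaultdict


def weighted_mean(values, weights):
    return sum(w * y for y, w in zip(values, weights)) / sum(weights)


def jackknife_mean(records):
    """records: list of (group, cluster, weight, value) tuples, one per student.

    Returns the jackknife estimator of the mean and its estimated sampling
    variance (sum of squared deviations of the pseudoestimators).
    """
    clusters = defaultdict(set)
    for group, cluster, _, _ in records:
        clusters[group].add(cluster)
    values = [y for _, _, _, y in records]
    pseudo = []
    for g in sorted(clusters):
        ordered = sorted(clusters[g])
        if len(ordered) < 2:
            # Single-cluster group: replicate weights equal the original weights.
            rep = [w for _, _, w, _ in records]
        else:
            first = ordered[0]
            factor = len(ordered) / (len(ordered) - 1)  # 2 for pairs, 1.5 for triplets
            rep = [
                (0.0 if c == first else w * factor) if grp == g else w
                for grp, c, w, _ in records
            ]
        pseudo.append(weighted_mean(values, rep))
    mu_j = sum(pseudo) / len(pseudo)
    var_j = sum((p - mu_j) ** 2 for p in pseudo)
    return mu_j, var_j
```

For two groups of two single-student clusters with unit weights and outcomes (1, 3) and (2, 2), the pseudoestimators are 2.5 and 2.0, so the jackknife mean is 2.25 and the variance estimate 0.125.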

The jackknife method given by (8) and (9) appears to be easy to implement, although the size of the
dataset is substantially inflated by the replicate weights. In NAEP Trial State Assessment the students'
records have length of about 1700, but two sets of replicate weights take up more than 800 columns.

Note that instead of the weighted mean (7) other statistics or estimators can be used as the 'parent'
method for the jackknife. Ordinary regression is an important example.

Our study focuses on methods of estimation of the population mean, and of the sampling variances of
these estimators, that depend on the design only through a single set of weights, the clustering, and the
stratification.

3.4 Model-based methods 

This section describes the model-based approach of Potthoff et al. (1992), as applicable to the NAEP Trial
State Assessment. In general, an estimator for the quantity of interest (say, the ratio estimator for the
population mean) is considered, and its sampling variance is expressed as a function of the modelled features
of the design. Typically, these features include clustering and stratification. Clustering can be represented
by one or several variance components and stratification by stratum-specific means (parameters).
For the NAEP State Assessment we consider the superpopulation model

$$y_{ijk} = \mu_k + \delta_{jk} + \varepsilon_{ijk}, \tag{10}$$

where the group means $\mu_k$ are unknown constants and $\delta_{jk}$ and $\varepsilon_{ijk}$ are mutually independent random
variables with zero expectations and respective variances $\sigma^2_B$ and $\sigma^2_{W,jk}$. The within-cluster variances $\sigma^2_{W,jk}$
are positive and unknown, and the between-cluster variance $\sigma^2_B$ is a non-negative constant. Note that $\sigma^2_B$
is the covariance of two observations in the same cluster:

$$\mathrm{cov}(y_{ijk}, y_{i'jk}) = \sigma^2_B, \qquad i \neq i'. \tag{11}$$

Often a common within-cluster variance is considered, $\sigma^2_{W,jk} = \sigma^2_W$. This is not a realistic assumption for
NAEP State Trial Assessment, however.

We consider the weighted mean, or the ratio estimator,

$$\hat\mu = \frac{1}{W}\sum_k\sum_j\sum_i w_{ijk}\, y_{ijk}. \tag{12}$$

Assuming (10) and appropriateness of the weights, that is, they are proportional to the reciprocals of
sampling probabilities, $\hat\mu$ is an unbiased estimator of the superpopulation mean

$$\mu = \frac{1}{W}\sum_k W_k\, \mu_k,$$

where $W_k = \sum_{ij} w_{ijk}$ and $W = \sum_k W_k$. We denote $W_{jk} = \sum_i w_{ijk}$, $\hat\mu_{jk} = W_{jk}^{-1}\sum_i w_{ijk}\, y_{ijk}$, and
$\hat\mu_k = W_k^{-1}\sum_{ij} w_{ijk}\, y_{ijk}$. Note that $\hat\mu_{jk}$ and $\hat\mu_k$ are unbiased estimators of $\mu_k$.

3.4.1 The sampling variance of the mean

Following Potthoff et al. we consider the weighted means $\hat\mu_{jk}$ as aggregate observations. We use the
'effective sample size' normalization of the weights; we set

$$w_{A,ijk} = w_{ijk}\,\frac{\sum_i w_{ijk}}{\sum_i w_{ijk}^2}, \qquad \text{so that} \qquad \sum_i w_{A,ijk} = \sum_i w_{A,ijk}^2,$$

and denote this total of weights by $n_{A,jk}$. In a general context, Potthoff et al. (1992) refer to $n_{A,jk}$ as
the 'effective sample size', to emphasize a connection with the number of 'degrees of freedom' of certain
variance estimators, see Section 3.4.2. For NAEP, $n_{A,jk}$ can be interpreted as the effective sample size of
cluster $jk$, although these quantities cannot be compared across clusters. A counterintuitive example arises
when there is a cluster with a large number of students, each with very small weight, and another cluster
with a small number of students, each with very small weight. For non-empty clusters $1 \le n_{A,jk} \le n_{jk}$,
and $n_{A,jk}$ approaches these extremes when the cluster contains a single observation with dominant weight
($n_{A,jk} = 1$), and when all the weights are almost constant ($n_{A,jk} = n_{jk}$). The latter is the case in the 1990
Math Trial State Assessment.
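The two extremes of the effective sample size are easy to verify numerically. The sketch below assumes the normalization of Potthoff et al. (1992), under which the total of the normalized weights equals the squared sum of the weights divided by their sum of squares; the function names are hypothetical.

```python
def effective_sample_size(weights):
    """Squared sum of the weights divided by their sum of squares (assumed definition)."""
    total = sum(weights)
    total_sq = sum(w * w for w in weights)
    return total * total / total_sq


def normalize_weights(weights):
    """Rescale the weights so that their total equals the total of their squares."""
    scale = sum(weights) / sum(w * w for w in weights)
    return [w * scale for w in weights]
```

Five equal weights give an effective sample size of 5, whereas one dominant weight accompanied by a few negligible ones gives a value close to 1.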

The within-cluster weighted mean is

$$\hat\mu_{jk} = \frac{\sum_i w_{ijk}\, y_{ijk}}{\sum_i w_{ijk}},$$

and its variance is

$$\mathrm{var}(\hat\mu_{jk}) = \sigma^2_B + \frac{\sigma^2_{W,jk}}{n_{A,jk}}.$$

The statistics $\hat\mu_{jk}$ are mutually independent linear components of the estimator (12):

$$\hat\mu = \frac{1}{W}\sum_k\sum_j W_{jk}\,\hat\mu_{jk}.$$

We define the effective cluster sample size as



and the normalized weights as



\7 



ir 



'Us 



and the sampling variance of $\hat\mu$ is

$$\mathrm{var}(\hat\mu) = \frac{\sigma^2_B}{W^2}\sum_k\sum_j W_{jk}^2 + \frac{1}{W^2}\sum_k\sum_j W_{jk}^2\,\frac{\sigma^2_{W,jk}}{n_{A,jk}}. \tag{18}$$

Thus the sampling variance of $\hat\mu$ depends on the variance components $\sigma^2_B$ and $\sigma^2_{W,jk}$. The next section
deals with estimation of these variances. Alternatives, discussed later, include imputing values for these
variances and applying smoothing techniques to improve estimation of $\mathrm{var}(\hat\mu)$ by pooling information across
subsamples.
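Under the superpopulation model, each cluster mean has variance equal to the between-cluster variance plus the within-cluster variance divided by the effective sample size, and the weighted mean combines the independent cluster means with weights proportional to the cluster totals of weights. A minimal sketch of the resulting variance calculation follows; the function name and argument layout are hypothetical.

```python
def variance_of_weighted_mean(cluster_totals, n_eff, var_within, var_between):
    """Sampling variance of the weighted mean from cluster-level summaries.

    cluster_totals: total sampling weight of each cluster.
    n_eff: effective sample size of each cluster.
    var_within: within-cluster variance of each cluster.
    var_between: between-cluster variance (a single number).
    """
    total = sum(cluster_totals)
    acc = 0.0
    for w_jk, n_jk, s2_jk in zip(cluster_totals, n_eff, var_within):
        # Variance of one cluster mean, weighted by the squared cluster total.
        acc += w_jk ** 2 * (var_between + s2_jk / n_jk)
    return acc / total ** 2
```

For two clusters with equal totals of weights, effective sample sizes 2, within-cluster variances 4 and between-cluster variance 1, each cluster mean has variance 3 and the weighted mean has variance 1.5.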



3.4.2 Estimation of the variance components

There are $\sum_k n_k + 1$ unknown variance parameters, $\{\sigma^2_{W,jk}\}$ and $\sigma^2_B$, arising in (18) (one within-cluster
variance for each of the $n_k$ clusters in group $k$, and the between-cluster variance). The within-cluster variances
$\sigma^2_{W,jk}$ can be estimated as the weighted within-cluster corrected sums of squares

$$\hat\sigma^2_{W,jk} = \frac{1}{n_{A,jk} - 1}\sum_i w_{A,ijk}\,(y_{ijk} - \hat\mu_{jk})^2. \tag{19}$$

They are unbiased estimators of $\sigma^2_{W,jk}$:

$$\mathrm{E}(\hat\sigma^2_{W,jk}) = \frac{1}{n_{A,jk} - 1}\left\{\sum_i w_{A,ijk}\,\mathrm{var}(y_{ijk} \mid \delta_{jk}) - n_{A,jk}\,\mathrm{var}(\hat\mu_{jk} \mid \delta_{jk})\right\} = \sigma^2_{W,jk}. \tag{20}$$

If a common within-cluster variance $\sigma^2_W$ is assumed the weighted sums of squares $\{\hat\sigma^2_{W,jk}\}$ can be pooled:

$$\hat\sigma^2_W = \frac{\sum_{jk}(n_{A,jk} - 1)\,\hat\sigma^2_{W,jk}}{\sum_{jk}(n_{A,jk} - 1)} \tag{21}$$

is an unbiased estimator of the common within-cluster variance $\sigma^2_W$. The definition of the effective sample
sizes $n_{A,jk}$ is motivated by unbiasedness of the estimators in (20) and (21). Also, these estimators have
approximate $\chi^2$ distributions with the degrees of freedom given by the denominators, see Potthoff et al.
(1992).
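The within-cluster variance estimator and its pooled version can be sketched in a few lines. The sketch assumes that the normalized weights are the sampling weights multiplied by the ratio of their sum to their sum of squares, so that their total is the effective sample size; the function names are hypothetical.

```python
def within_cluster_variance(values, weights):
    """Weighted corrected sum of squares divided by (effective sample size - 1).

    Returns the variance estimate and the effective sample size of the cluster.
    """
    total = sum(weights)
    total_sq = sum(w * w for w in weights)
    n_eff = total * total / total_sq
    if n_eff - 1.0 <= 1e-12:
        raise ValueError("effective sample size must exceed 1")
    scale = total / total_sq  # assumed normalization of the weights
    mu = sum(w * y for y, w in zip(values, weights)) / total
    ss = sum(scale * w * (y - mu) ** 2 for y, w in zip(values, weights))
    return ss / (n_eff - 1.0), n_eff


def pooled_within_variance(clusters):
    """clusters: list of (values, weights) pairs, one pair per cluster."""
    num = den = 0.0
    for values, weights in clusters:
        s2, n_eff = within_cluster_variance(values, weights)
        num += (n_eff - 1.0) * s2
        den += n_eff - 1.0
    return num / den
```

With equal weights the estimator reduces to the ordinary sample variance; for instance, the values 1, 2, 3, 4 give 5/3.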

For estimation of the between-cluster variance $\sigma^2_B$ we consider weighted within-group sums of squares:

$$v_1 = \sum_k\sum_j u_{jk}\,(\hat\mu_{jk} - \hat\mu_k)^2, \tag{22}$$

where $\{u_{jk}\}$ is a suitable set of non-negative coefficients (weights). The expectation of $v_1$ is



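$$\mathrm{E}(v_1) = \sum_k\sum_j u_{jk}\left\{\mathrm{var}(\hat\mu_{jk}) + \mathrm{var}(\hat\mu_k) - 2\,\mathrm{cov}(\hat\mu_{jk}, \hat\mu_k)\right\}, \tag{23}$$

where

$$\mathrm{var}(\hat\mu_k) = \frac{1}{W_k^2}\sum_j W_{jk}^2\,\mathrm{var}(\hat\mu_{jk}), \qquad \mathrm{cov}(\hat\mu_{jk}, \hat\mu_k) = \frac{W_{jk}}{W_k}\,\mathrm{var}(\hat\mu_{jk}).$$

The development thus far imposes no restriction on the coefficients $u_{jk}$. Obvious choices for them are the
totals of sampling weights, $W_{jk}$, and the effective sample sizes, $n_{A,jk}$.
For $u_{jk} = W_{jk}$, (23) simplifies to

$$\mathrm{E}(v_1) = \sum_k\sum_j W_{jk}\left(1 - \frac{W_{jk}}{W_k}\right)\mathrm{var}(\hat\mu_{jk}). \tag{24}$$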

A drawback of the scheme based on $u_{jk} = n_{A,jk}$ is that the effective within-cluster sample sizes $n_{A,jk}$
cannot be compared across clusters, and may be very misleading when the sampling weights have a large
between-cluster component of variation. Neither of these choices for $\{u_{jk}\}$ takes account of the differing
within-cluster variances $\sigma^2_{W,jk}$, or of the differential contributions of the clusters to the within-group sum
of squares $v_1$ in (22).

The within-cluster weighted means $\hat\mu_{jk}$ have an approximate normal distribution, and so the squared
deviation $(\hat\mu_{jk} - \hat\mu_k)^2$ has a $\chi^2$-like distribution. Thus $\mathrm{var}\{(\hat\mu_{jk} - \hat\mu_k)^2\}$ is approximately proportional to
the square of the expectation. The optimal choice of $\{u_{jk}\}$ is given by the set of coefficients for which the
variance of $v_1$ is minimized (subject to a constraint, such as $\sum_k\sum_j u_{jk}$ being equal to a constant). Assuming,
for the moment, that $\sigma^2_B$ is known, and ignoring the interdependence of the squared deviations, we obtain,
using (17), the optimal coefficients

I 

(25)
The within-cluster variances $\sigma^2_{W,jk}$ can be replaced by their estimates in (19). In the absence of an estimate
of $\sigma^2_B$ a guess has to be used. It turns out that the accuracy of this guess is not critical; for instance, setting
$\sigma^2_B = 0$ in (25) is often adequate.

As an alternative, the reciprocals of the contributions to (23) can be calculated using an estimate of
$\sigma^2_B$ obtained by one of the other methods. In principle, this recursive algorithm can be applied until
convergence, but changes after the first iteration are unimportant.

The between-cluster variance $\sigma^2_B$ is estimated by the method of moments:

Note that f'i(- = 0 holds only when s.ralnni /• is represented in the- sample by a single cluster. For 
such a cbister and stratum /ijj. = and = /i k . and so these clusters make n>> contribution to t lie 
sitm-of-Mpiares statistics r«_> or rn . 

In the 1990 Math Trial State Assessment most groups contain two clusters. As an alternative to the
estimator (22) the following class of statistics for estimation of the between-cluster variance can be used:

$$v_2 = \sum_k u_k\,(\hat\mu_{1k} - \hat\mu_{2k})^2, \tag{27}$$

where the summation is over all groups with at least two clusters, and $\{u_k\}$ are suitable weights (constants).
A group with a single cluster in the sample cannot contribute to estimation of the between-cluster variation,
and so the only apparent loss is due to the groups with more than two clusters. The only advantage of
(27) over $v_1$ in (22) is in relative computational simplicity. For a group with two or more clusters we have

$$\mathrm{E}(\hat\mu_{1k} - \hat\mu_{2k})^2 = \sum_{j=1}^{2}\mathrm{var}(\hat\mu_{jk}) = 2\sigma^2_B + \sum_{j=1}^{2}\frac{\sigma^2_{W,jk}}{n_{A,jk}}, \tag{28}$$

and

$$\mathrm{E}(v_2) = \sum_k u_k\left(2\sigma^2_B + \sum_{j=1}^{2}\frac{\sigma^2_{W,jk}}{n_{A,jk}}\right). \tag{29}$$



This, together with (21), yields a class of moment estimators of $\sigma^2_B$:

$$\hat\sigma^2_B = \frac{v_2 - \hat\sigma^2_W\sum_k u_k\sum_{j=1}^{2} n_{A,jk}^{-1}}{2\sum_k u_k}, \tag{30}$$

where the summations for $k$ are over groups with at least two clusters in the sample. In analogy with the
schemes for $v_1$ we consider the following choices for the coefficients $u_k$:

• the within-group total sampling weights $W_k$,

• the total of the effective sample sizes $n_{A,1k} + n_{A,2k}$,

• weights inversely proportional to the expected sum of squares in (28) under $\sigma^2_B = 0$.
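The moment estimator based on pairs of clusters can be sketched as follows: the weighted sum of squared differences of the two cluster means is corrected for its within-cluster component and divided by twice the total of the coefficients. The sketch accepts a within-cluster variance for each cluster; a common (pooled) variance is the special case in which all of them are equal. The function name and the data layout are hypothetical.

```python
def between_cluster_variance_from_pairs(groups, coefficients=None):
    """Moment estimate of the between-cluster variance from two-cluster groups.

    groups: list of pairs of clusters; each cluster is a tuple
            (weighted mean, within-cluster variance, effective sample size).
    coefficients: one non-negative coefficient per group (default: all equal to 1).
    The result can be negative; no truncation at zero is applied.
    """
    if coefficients is None:
        coefficients = [1.0] * len(groups)
    v2 = 0.0
    correction = 0.0
    for u, (first, second) in zip(coefficients, groups):
        (m1, s1, n1), (m2, s2, n2) = first, second
        v2 += u * (m1 - m2) ** 2                # squared difference of cluster means
        correction += u * (s1 / n1 + s2 / n2)   # its expected within-cluster part
    return (v2 - correction) / (2.0 * sum(coefficients))
```

For two groups with cluster means (1, 3) and (2, 2), unit within-cluster variances, effective sample sizes 4 and unit coefficients, the sum of squares is 4, the correction is 1, and the estimate is 0.75.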






Table 2: Coefficients for the within-group sums of squares $v_1$ and $v_2$



as III. with a\ in place of rrj, as III. with cry, in place of (t\ 



Note: Estimators of the between-cluster variance $\sigma^2_B$ are referred to by the combination of the sum of squares (SSQ)
used, B or C, and the method (choice of coefficients), I, II, III, or R.



Alternatively, the variance $\sigma^2_B$ can be estimated by (30) using one of these sets of weights, and then
reestimated using the weights inversely proportional to (28). Of course, this recursive estimation scheme
can be used until convergence is achieved. However, after one such iteration the change in $\hat\sigma^2_B$ is usually
unimportant. The motivation for these sets of coefficients is analogous to their counterparts for (26).

The choices for the coefficients $u_{jk}$ in $v_1$ and $u_k$ in $v_2$ are summarized in Table 2. Examples of these
estimators are given in Section 3.5. We refer to the estimators of $\sigma^2_B$ and to the estimators of the sampling
variance of $\hat\mu$ by symbols 'B' (based on (30)) or 'C' (based on (26)), and 'I' (coefficients $W_k$ or $W_{jk}$), 'II'
(coefficients $n_{A,jk}$ or $n_{A,1k} + n_{A,2k}$), 'III' (reciprocals of the expected contributions to $v_2$ or $v_1$ assuming
$\sigma^2_B = 0$), and 'R' (reciprocals of the expected contributions calculated for $\sigma^2_B$ estimated by the method I).
The estimates of the variances $\sigma^2_{W,jk}$ and $\sigma^2_B$ are substituted for their true values in the identity for the
variance of $\hat\mu$, (18).

If we insist on interpretation of $\sigma^2_B$ as a variance then negative values of $\hat\sigma^2_B$ are not admissible. If for
each negative value of (30) the estimate of $\sigma^2_B$ is set to zero, as is often done in practice, the resulting
estimator is biased, especially when the true value of the parameter $\sigma^2_B$ is close to zero. On the other
hand, $\kappa = \sigma^2_B$ can be interpreted as within-cluster covariance, see (11), and then its negative values are
admissible. The minimum within-cluster covariance $\kappa$ that can be realized for a cluster of size $N_{jk}$ is
$-1/(N_{jk} - 1)$. Note, however, that the sample cluster size $n_{jk}$ may be much smaller than the population
cluster size $N_{jk}$; a negative estimate of the covariance $\kappa$ may be realizable for the sample selected from
the cluster, but not for the entire population of the cluster.
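The bias induced by setting negative estimates to zero can be illustrated by a stylized simulation. The sketch below is not based on the NAEP data: it simply assumes that the untruncated moment estimator is unbiased with normally distributed error, and the function name and default settings are hypothetical.

```python
import random


def truncation_bias(true_value, sd, n_replications=20000, seed=0):
    """Return the means of the raw and of the zero-truncated estimates.

    Assumption: raw estimates are the true value plus normal error with
    standard deviation sd.
    """
    rng = random.Random(seed)
    raw = [true_value + rng.gauss(0.0, sd) for _ in range(n_replications)]
    truncated = [max(0.0, x) for x in raw]
    return sum(raw) / n_replications, sum(truncated) / n_replications
```

When the true value is zero, the raw estimates average close to zero, whereas the truncated estimates average about 0.4 standard deviations above it; the discrepancy diminishes as the true value moves away from zero.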

3.5 Examples 

The jackknife and model-based methods for estimation of the mean are illustrated on a few examples
using the data from New Jersey and Oklahoma. Adjustment of the weights due to nonresponse is ignored
throughout the section, but it is explored in the next section.

Table 3: Jackknife analysis. Estimation of the population mean of proficiency scores.



vow Jorsoy Plausible values Overall 



Weighted mean 
Jadikuifo mean 
.J K si ami . error 


2r;'! 17 

2(5!), IS 
1.01 


2(511.42 
2(5<l,i;5 
1.07 


2m :m 
'im.:\7 
1 .(i.i 


20:1,10 
2(5i),I2 
1.00 


2m M') 
2(><).0. r > 
1.01 


2(5!). Hi 
2(i ( M7 

1 .0.1 


Oklahoma 















Weighted mean 202. <i.") 2(12.71 262. 78 2(52.71 2(i2.(i;{ 

.lark knife iii.-an 202. HI 202.71 202.71 2(52.(58 202.00 

•IK stand, or t«t 1.2:5 1.27 1.21 1.2! 1 ■>■> 



3.5.1 Population mean

Estimation of the population mean by jackknife is summarized in Table 3 for New Jersey and Oklahoma.
In essence, separate jackknife analyses are carried out for each plausible value, and the estimators based
on the plausible values are combined into the 'Overall' estimator which takes account of variation due to
estimation of the proficiency scores. The estimate of the population mean proficiency is the average of the
estimates of the mean based on the five plausible values. The sampling variance of the estimator of the
population mean is estimated using (6). The estimates for New Jersey are based on 2719 students from
101 clusters, those for Oklahoma on 2222 students from 108 clusters.

The differences between the weighted means and the jackknife estimates of the means are inconsequential;
note however, that the differences for Oklahoma appear to be consistent, though negligible. For
most purposes the statistics based on plausible values 1-5 are of no interest, and their summaries in the
right-most column of Table 3 are used.

Model-based estimators of the standard error of the weighted mean are summarized in Table 4 for
the four estimators based on each sum-of-squares statistic, $v_1$ and $v_2$. The estimators BI and CI require
least computation, owing to simpler equations for estimation of $\sigma^2_B$, and the estimators BR and CR most
(almost twice as much as BI and CI respectively). For completeness, the second row of the table contains
the pooled estimates of the common within-cluster variance $\sigma^2_W$.

The eight sets of estimators of the standard error are within a range of 0.04, but they differ from
the jackknife estimate by about 0.15 (almost 13 per cent). Based on this analysis we cannot arbitrate
whether such a difference is due to sampling variation of the estimator of the standard error, or whether
the jackknife and model-based estimators have different biases (or indeed, whether the jackknife is unbiased
and model-based estimators are not).

Table 5 summarizes model-based estimation of the population mean for Oklahoma. Contrasting the
analysis for New Jersey, the model-based estimates for Oklahoma are very close to the jackknife estimate of

Table 4: Model-based estimators of standard error of the weighted mean.



Now Jorsoy 
Plausible values Overall 



3 



Weighted mean 2S)iI.47 200,12 20SU7 20S),I0 2(i!Ui. r , 2(>!l. 10 

of,. B7S.BS ' (U).").81 6(53.89 (>78.S0 (i(i7.7. r ) G71.o!) 

Method BI 

rrjj DD.3CS 10 !. 17 SI7.S3 100.1(1 SIS 20 100.01 

Si andar.l error LIS) 1.21 1.17 I. ID 1.18 I.HI 

mi 

c% 97.(58 !)S).2.") Jll.ll !I8.<I() SKI. So (JR. 78 

Si aiulard error 1.18 1.18 I. II I. Is 1.17 1.17 

Bill 

aj, 101.21 107.11 SM.S2 107.10 101. oO 103.07 

Standard error 1.20 1.22 I . Hi 1.22 1 ,2J 1.20 

BR 

n] t SKi.S.") SID. 2D DO. 20 073 Dti.20 DO. 02 

Standard error 1.17 1.18 1.11 1.1S 1.17 1.17 

CI 

07, 107,10 10S.2SI 100.07 101.32 1(13. Dl 10 1.20 

Siandar.l error 1.22 1.22 1.1s 1.1S1 1.21 1.21 

C1I 

rr'fi 101. (il 101, Hi <X>M SID. 71 101.3.") 101.11 

Standard error 1.21 1.21 l.H) 1.1!) LIS) LIS! 

cm 

a'j, [mm 102.0") 91.00 SID. 03 100.30 SI0.SI2 

Standard error 1.20 1.20 l.Hj I.HI 1.1 SI l.H) 

CR 

rr% 101.7,3 1UI.SU Do. 2D 100. C") 102.02 ' 101.18 

Standard error 1.21 1.21 I. Hi l.H) 1.20 I. HI 



Notes: Estimation of the population mean of proficiency scores for New Jersey. The methods are described in the
text and in Table 2; $\hat\sigma^2_W$ is the pooled estimate of the within-cluster variance; $\hat\sigma^2_B$ is the estimate of the
between-cluster variance. The estimates are given for each plausible value and for the proficiency score (column 'Overall').

Table 5: Model-based estimators of standard error of the weighted mean. Estimation of the population
mean of proficiency scores for Oklahoma.



Oklahoma 

Plausible values 



Overall 



3 



Weighted mean 



Standard error 



St andard error 



Standard error 



St andard error 



" l) 

Standard 



St andard error 



Standard e 



Standard error 



262.8-5 
fi 51.7 5 

1 14.60 
1 .22 



118.12 
1.24 



103.71 
1.18 

120.18 
1 .2.5 

121.29 
1.2") 

119.13 
1 .24 

121.80 
1.26 

121. 11 
1.20 



262.74 
072.11 



202.78 
675.53 



Method BI 



119.77 
1.25 

123.49 
1.20 

1 1 1 .82 
1.22 

199.58 
1.25 

128.09 
1.28 

125.04 
1.27 

128.80 
1.27 

125.58 
1.27 



BII 



Bill 



112.07 
1.22 

110.43 
1.24 

101.29 
1.17 



BR 



CI 



CII 



cm 



CR 



114.54 
1.23 

118.91 
1.25 

117.22 
1 .24 

120.10 
1 .25 

118.95 
1.25 



262.71 
678.84 

116.81 
1.24 

121.29 
1.25 

113.47 
1.22 

118.70 
1.24 

120.82 
1.25 

119.01 
1.24 

122.52 
1.26 

120.73 



262.03 
652.18 

1 14.08 
1.22 



7.10 

! .23 



104.88 
1.18 

1 15.25 
1.23 

1 19.20 
1.24 

1 Hi. 28 
1.23 

1 19,19 
1.24 

1 18.50 
1 .24 



202.70 
006.08 



1 15.58 
1.23 

1 19.28 
1.25 

107.03 
1.20 



.7.05 
1.24 



121.06 
1.26 

1 19.46 
1.25 

122.56 
1.26 

121.58 
1.20

the standard error. The pooled estimates of the within-cluster variance for New Jersey and Oklahoma are
alike, as are the between-cluster (within-group) variances. The latter variances are very large, considering
purposeful grouping of the clusters into groups. For example, the estimated within-cluster correlations for
Oklahoma, using method I, are around 115.58/(115.58 + 666.68) = 0.15. Without adjustment for group
(stratification) these correlations are much larger (around 0.35).
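The within-cluster correlation quoted above is the ratio of the between-cluster variance to the sum of the between- and within-cluster variances. A one-line check of the arithmetic, with a hypothetical function name:

```python
def within_cluster_correlation(var_between, var_within):
    """Share of the total variance that is due to between-cluster variation."""
    return var_between / (var_between + var_within)


# Oklahoma figures quoted in the text: 115.58 (between) and 666.68 (within).
rho = within_cluster_correlation(115.58, 666.68)  # about 0.148, i.e. 0.15 after rounding
```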

The elapsed time for the analysis producing the eight model-based estimates displayed in Tables 4 and
5 is less than twice the elapsed time for the jackknife analysis.

Of principal interest in Tables 4 and 5 are the right-most columns ('Overall') giving estimates that take
account of inaccuracy in estimation of the proficiency scores.

3.5.2 Subpopulation means 

Table 6 displays the estimates of the means for a selected set of subpopulations in New Jersey. The
subpopulation is characterized by the questionnaire item and response; for instance, (183.2) in the first
column of the table signifies the subpopulation of students who responded '2' (Graduated from high school)
to item number 183 (Parents' educational level). For each subpopulation the corresponding sample size
and number of (non-empty) clusters are given. The standard errors estimated by jackknife are given in
parentheses underneath the associated estimate. For the model-based methods the standard errors are
given in parentheses, and the estimated between-cluster variances in brackets. The methods B (for pairs
of clusters) differ from their method C counterparts, but the differences are insubstantial in comparison
with the estimators within a method, especially for small samples. To conserve space, only results for the
method C are given.

There appears to be considerable agreement between the jackknife and model-based estimators of the
standard errors, especially for larger datasets (with more than 1000 students). On the other hand, among
the estimated standard errors for small datasets there are considerable differences. It is feasible, however,
that they merely reflect substantial sampling variation. For instance, the dataset for item and response
(28.4) (Asian American students) contains 131 students, 14 of whom are in a single cluster; of the remaining
56 non-empty clusters only 18 contain more than two students, and none contains more than six.

3.6 Adjustment of weights for nonresponse 

For purposes of statistical analysis, adjustment is commonly interpreted as a perturbation of the sampling
weights. The adjustment for a student in the sample drawn may be different from the adjustment in a
different sample in which the student is also selected. This creates problems with all methods that rely
on the sampling weights being constants fixed prior to selection of the sample. A simplistic approach to
dealing with such adjustment is to ignore the stochastic nature of the adjusted weights (their variation
over samples), and proceed with the analysis as if the adjustment of weights took place prior to sample
selection. This approach is certainly justified when the weights are altered only marginally. This is the
case in the New Jersey dataset but not in the dataset for Oklahoma. In this section we show, though,

Table 6: Jackknife and model-based estimates for a selected set of subpopulations: New Jersey.



Now Jersey 



Item 


St lldenls 
and 


Weighted 

mean 


JnfkkmTt- 




M< 


(hod C 




response 


clusters 


and <r^. 


I 


II 


III 


R. 


28. 1 


1789/95 


279.1!! 
662.56 


279.5-1 
(1.06) 


(1.1(1) 
[67.50] 


(1.1.1) 
[60.63] 


(1.28) 
[81.29] 


(1.33) 
[90.74] 


28.2 


308/01 


2-10.79 
397.6-1 


210.83 
(2.30) 


(2.3-1) 
[73.73] 


(2.12) 
]52. 88] 


(2.30) 
[76,15] 


(3.20) 
[110/11] 


28,1 


131/57 


25)6.97 
600.03 


296.99 
(-1.02) 


(5.0-1) 
[116,13] 


(o.I7) 
[ 90.92] 


(5.23) 
[160.63] 


(■1-83) 
I' 17.96] 



31.1 656/25 ' m ' M ( 2 -0» (1.97) 

71!iJ1 tt-MJ [72.66] [58.76] [55.28] [53.68] 

31.2 1 1 70/-Io 2?! ' 79 27K8 ' ! (1 ' 51) t 1 -^) (1-37) (1.58) 
(i00:i7 C 2 -"-') [^^) [66.69] [(J7.3!)] [68.10] 



183.2 633/102 2 ™' 87 ' 2h *- m ( L7(i ) d-TI) (1.76) (1.56) 

;)7,S!I tl-3») [13^.10] [127.06] [139.69] [95,1k] 

183,1 1225/101 281 -' i8 28l ' !i8 tl-10) (1.42) (1.41) 
71fi87 (J£5J [»«-o«] [8!). 30] [100.87] [98.19] 



193.1 386/3-1 2817:5 281 - 70 ( ' U8 ' ( 3 -*») C1IH) (-1.02) 

5l5 ' 2: > [36-1.5**] [277.57] [322.51] [306.86] 

193.2 677/61 2725)7 2731,0 (211 > Ci-IB) (2.60) 

5601 (2- r w) [1)0.10] [1-10.76] [221.61] [223,16] 

193.3 [3-10/91 mM H.89) (1.85) (1.98) (2.06) 

580.20 (1.97) [193.71] [183.92] [212.05] 



193.9 :j()l/3H ' mM mM ( ;5 - 8S ! ( : *^) (-l- r )9) (5.29) 
W'-M (-'■"!) ["»-0 'J] [36.80] [127.87] [103.27] 



Notes: For each method the estimates of $\sigma^2_B$ are given in brackets, [ ], and the estimated standard errors in
parentheses, ( ). The items and response options are:
28 Derived race/ethnicity (1 White, 2 Black, 4 Asian);
31 Minority stratum;
183 Parents' educational level (2 Graduated from high school, 4 Graduated from college);
193 Teacher's graduate major (1 Mathematics, 2 Education, 3 Other, 9 Missing).

Figure 1: Student-level weight adjustment factors for Oklahoma.

that this approach involves trivial imprecision even for Oklahoma, so long as the adjustment of weights is
worthwhile.

The design weights have two multiplicative components, school- and student-level design weights. The
school-level weights are proportional to the reciprocals of the probabilities of inclusion of the school in
the sample, assuming that all schools would cooperate. Note that these weights refer to schools, not to
clusters. The student design weights are proportional to the reciprocals of the conditional probabilities of
inclusion in the sample, given that the school is included.

Adjustment of these weights due U< non-response also has two until iplicat ive components one for 
dusters and one for students. In the New .Jersey dataset there are only three different clusier-level ad- 
justment facte rs: I. ODD (no adjust men! ), 1 .1127. and 1.0-1(1, for ID. .ST. and 21 clusters, respectively. The 
student-level adjust men! factors are in the range I. Dili 1. 1.10, with mean and median equal io l.Dfi. and 
sample standard deviation equal to 0.021. In summary, the adjustments of (he weights (product of the 
school- and student -level adjust meats) are in the range 1 .01 (i 1.171. with mean and median equal to 1 .08 
and standard deviation equal m D.OH'2. 

In contrast, in the dataset for Oklahoma (108 schools with 2222 students), where the non-response was
much higher (about 20 per cent at student level), the adjustment of weights is much more substantial.
The design weights (both school- and student-level) are constant within schools, and so are the school-level
adjustments. The student-level adjustment factors have 35 distinct values, two in most clusters. These
factors are in the range 1.00 ... (see Figure 4).

The school-level adjustment factors are much less important. Essentially, there are three distinct values
of the factor: for 60 schools (1 ISI.'j students) the factor is equal to 1.00; for 42 schools with 80a students the
adjustment factor is 1.01 to 1.016; and for the remaining 6 schools (152 students) the adjustment factor is
1.16. New Jersey and Oklahoma represent two extremes among the states participating in the 1990 Math
Trial State Assessment in terms of nonresponse and consequent adjustment.

Approaches other than the jackknife to inference from survey samples regard the adjusted weights as design
weights, ignoring the stochastic nature of the adjustments. We explore whether such an approach is justified
for the model-based estimators by a Monte Carlo study in which the sampling weights are perturbed by
random terms with suitably chosen dispersion. Instead of the realized weights $w_{ijk}$, we consider a set of
'perturbed' weights, generated by the model

$$\log w^{(s)}_{ijk} = \log w^{*}_{ijk} + \varepsilon^{(s)}_{jk} + \delta^{(s)}_{ijk}, \tag{31}$$

where $\{\varepsilon^{(s)}_{jk}\}$ and $\{\delta^{(s)}_{ijk}\}$ are two mutually independent random samples from $N(0, \sigma^2_{w,B})$ and $N(0, \sigma^2_{w,W})$,
respectively, and $w^{*}_{ijk}$ is the mean weight for student $ijk$, averaged over all the samples. The choice
of values for the variances $\sigma^2_{w,B}$ and $\sigma^2_{w,W}$ is discussed below. Note that $\exp(\varepsilon^{(s)}_{jk})$ and $\exp(\delta^{(s)}_{ijk})$ are
essentially different from the weight adjustments; if they were not, weight adjustment would have no
stochastic component. The model in (31) assumes independent deviation factors for clusters and students,
both log-normally distributed. This assumption cannot be checked because the 'true' weights $w^{*}_{ijk}$ are not
known. Also, the weights $\{w^{*}_{ijk}\}$ are defined subject to a multiplicative factor constant for a sample. In
(31) a suitable constant factor is assumed. There is no evidence of dependence of the applied school- and
student-level adjustment factors.
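The perturbation model in (31) can be illustrated by a minimal sketch. It assumes that each student record carries a mean weight and a cluster label; the function names, the toy data, and the choice of 100 replicates for the toy run are illustrative assumptions, not the code used in the study.

```python
import math
import random


def perturb_weights(mean_weights, cluster_of, sd_cluster, sd_student, rng):
    """Draw one set of perturbed weights following model (31).

    mean_weights : mean weight of each student (w* in the text)
    cluster_of   : cluster label of each student
    sd_cluster, sd_student : standard deviations of the cluster- and
        student-level deviations on the log scale
    """
    # One log-scale deviation per cluster, shared by all its students.
    eps = {c: rng.gauss(0.0, sd_cluster) for c in sorted(set(cluster_of))}
    # An independent log-scale deviation for every student.
    return [
        w * math.exp(eps[c] + rng.gauss(0.0, sd_student))
        for w, c in zip(mean_weights, cluster_of)
    ]


def weighted_mean(values, weights):
    """Ratio estimator of the mean."""
    return sum(w * y for w, y in zip(weights, values)) / sum(weights)


# Hypothetical toy data: 3 clusters of 4 students each, unit weights.
rng = random.Random(1)
cluster_of = [c for c in range(3) for _ in range(4)]
weights = [1.0] * 12
scores = [250.0 + 5 * c + i for c in range(3) for i in range(4)]

# Weighted means recomputed under 100 sets of perturbed weights.
estimates = [
    weighted_mean(scores, perturb_weights(weights, cluster_of, 0.05, 0.1, rng))
    for _ in range(100)
]
```

The spread of `estimates` around the unperturbed weighted mean is the quantity that the Monte Carlo study compares with the estimated sampling variance.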

We adopt the 'working' assumption that the sampling weights are unbiased estimators of a fixed multiple
of the reciprocal of the sampling probabilities, and that the adjustment of the design weights at both cluster
and student levels has the following properties. Each adjustment factor has two components: adjustment
of bias of the design weights, and a random component. We assume that the variation of the random
component is of the same order as the variance of the bias of adjustment, that is, the adjustment is
reasonably efficient. This suggests the choice of school- and student-level variances $\sigma^2_{w,B}$ and $\sigma^2_{w,W}$ of the
same order of magnitude as the sample variances of the logarithms of the adjustment factors.

The adjustment factors for New Jersey are so small that even for an unrealistically large perturbation of
weights the weighted means have observed variances negligible in comparison with the estimated sampling
variance of the estimator for the population mean. We chose the variances $\sigma^2_{w,B} = 0.05^2$ for schools and
$\sigma^2_{w,W} = 0.1^2$ for students. To illustrate this perturbation, the basic descriptive statistics of the adjustment factors
are compared with a random sample from the distribution used for perturbing the sampling weights. The
basic descriptive statistics (minimum, median, mean, maximum, and standard deviation in parentheses)
for the school-level adjustment (on log-scale) are:

0.000, 0.018, 0.027, 0.045, (0.018)

The same statistics for a random sample of size 104 (the number of schools) from the distribution generating
the perturbation factors are

0.002, 0.04.1, 0.037, 0.157, (0.034) 
The corresponding statistics for the student-level adjustments and perturbation are 

0.016, 0.058, 0.056, 0.140, (0.022)
and

0.000, 0.08U, 0.008, 0.394, (0.060)
Thus the simulated perturbation changes the weights much more than the realized adjustment for nonresponse.

One hundred sets of perturbed weights were generated for the entire sample and several subsamples of
the New Jersey dataset. The mean and standard deviation of the simulated estimates of the population
mean are 269.43 and 0.2035, respectively. The jackknife and ratio estimates of the population mean are
269.47 and 269.46, respectively. For subpopulations the corresponding differences are also trivial; for
example, the mean of the simulated estimates of the mean proficiency of the Asian American students
(131 students in 57 clusters) is 296.91 (standard deviation of the simulated estimates is 0.47), while the
ratio and jackknife estimates are 296.99 and 296.97, respectively. The estimated standard error of these
estimators is around 5.0. The variation in the estimates of the sampling variation is also unimportant.

A similar analysis for Oklahoma yields somewhat larger differences. The jackknife and ratio estimates
of the population mean are 262.91 and 262.95, respectively, and the mean of the simulated estimates
is 262.77 (the standard deviation of these estimates is 0.30). The corresponding means for the Asian
American students (36 students in 25 clusters) are 286.47 (ratio estimate), 286.57 (jackknife), and 286.49
(simulation). The differences among these means are trivial in comparison with the substantial sampling
error. The impact of perturbation of weights on the estimated sampling variance is also trivial.

3.7 Association of weight adjustment and proficiency 

A simpler, though incomplete, way of assessing the influence of the weight adjustment on the estimate of 
(sub-)population means is based on exploring the association of the weight adjustment with the proficiency. 
For simplicity we consider the first plausible value as a representation of the proficiency. The estimate of
the population mean for New Jersey, based on the design weights, is 269.28, 0.21 lower than the ratio
estimate based on the adjusted weights. For Oklahoma, the design-weight sample mean is 262.95, 0.26
lower than the adjusted-weight sample mean. Such differences are no longer trivial, although the biases
incurred are in no way consequential.

Influence of the weight adjustment on the estimate of the population mean is a result of association of
the school- and student-level adjustment factors with proficiency. Figure 5 displays the plot of the school-level
adjustment of the weights against the school-means of proficiencies (left-hand panel) and the plot of
student-level adjustment against the proficiencies. The school-level adjustments are positively associated
with mean proficiency: 'better' schools were more likely to decline participation in the survey. On the
other hand, student-level adjustment is negatively associated with proficiency. Students with lower ability
are more likely to abstain from the survey. Since the school-level adjustment is on a much narrower scale,
the overall adjustment is affected only very moderately by the school-level weight adjustments.

For subpopulations the impact of weight adjustment varies depending on the stochastic mechanism of
'selection' of the subpopulation. For example, the weighted means for urbanicity stratum 1 (442 students)
in New Jersey are 238.01 and 238.14 for the design and adjusted weights, respectively; for students who

Figure 5: Association of cluster- and student-level weight adjustment factors with proficiency; New Jersey.

responded -.r (undecided) l.ill„.ii,«„tnl»«iM |J„.ir l«'irq,|inri«.riimlli..iii;ilksi." l : ) .K|i| ( |rt l i h ) these means are 
Wi'.i*t)j.rid 2.1'U'.:!. Both differences are small fractions of the corresponding estimated standard deviations
of the weighted means. The corresponding differences for Oklahoma are only marginally lower.

In conclusion, weight adjustments have a small, although perceptible, impact on estimates of the
population and subpopulation means.

The adjustment of the sampling weights for Oklahoma is comparable to the perturbation of the school-level
weights by $N(0, 0.1^2)$ and of the student-level weights by $N(0, 0.025^2)$.

3.8 Multivariate outcomes 

It is easy to see that both the jackknife and the model-based methods have direct extensions for multivariate
outcomes. Estimation of the population mean is carried out component-wise, and the sampling variance
matrix of the vector of estimated means replaces the sampling variance of the univariate case for the jackknife.
For the model-based methods, using notation analogous to that of the univariate case, all the univariate equations
apply, with the variance components replaced by the corresponding variance matrices.

3.9 Modelling approach 



In this section we consider an adaptation of the maximum likelihood method. For greater generality
we consider ordinary regression instead of the population mean.
A well established approach to regression analysis of data from surveys with complex design relies on
a superpopulation model in which survey features are typically represented as differences among sampling
units. In the case of simple regression it is natural to consider regression of $y$ on $x$ with coefficients varying
across the clusters, and differences (unknown constants or functions) among the groups:

$$y_{ijk} = a_{jk} + b_{jk}\,x_{ijk} + \varepsilon_{ijk}, \tag{32}$$

where $(a_{jk}, b_{jk}) \sim N\{(A_k, B_k), \Sigma_k\}$ and $\varepsilon_{ijk} \sim N(0, \sigma^2_{W,jk})$, independently. Usually, submodels of (32)
defined by constraints, such as $\Sigma_k \equiv \Sigma$, $b_{jk} \equiv b_k$, and the like, are considered. Realistic submodels often
still contain a large number of parameters, one for each group, so as to reflect substantial between-group
differences. To consider the population mean, set $b_{jk} \equiv 0$ in (32).

For the case of constant weights, $w_{ijk} \equiv 1$, fitting such models, say, by maximum likelihood (ML), is
carried out by iterative procedures the complexity of which depends essentially on the number of estimated
parameters. For unequal weights the crossproduct statistics required for ML are replaced by their weighted
versions. Interpretation of the estimates, as well as of the model parameters, is problematic because they
have to be combined to obtain quantities that relate to the target population. A simple adaptation due to
Harville (1974) adjusts for the bias of the maximum likelihood estimator of a variance due to ignoring the
regression parameters (or the population mean). This method is called the restricted maximum likelihood
(REML).

The principal disadvantage of the model in (32) is that no distinction is drawn between sampling errors
and imperfect description of the association in the target population. On the other hand, the model-based
procedures appear not to cater for separate components of variation stemming from the clustered nature of
the target population. This deficiency can, in principle, be resolved by defining more complex population
summaries such as measures of between-cluster variation.

4 Simulations 

The purpose of the simulation study described in this section is to compare the properties of the jackknife
and the proposed model-based estimators. The mean squared errors of the estimators of the (sub-)population
means are of principal interest. In summary, the model-based estimators are much more efficient,
in terms of mean squared errors, than the jackknife, and the differences among the model-based estimators are
relatively unimportant. We note, however, that the comparison is somewhat unfair to the jackknife since
the data are simulated according to the model on which the alternative methods are based. In particular,
weight adjustment is ignored in the simulations.

We consider a dataset, such as the set of all students in the New Jersey sample, with their sampling
weights, clustering structure, and stratification/grouping, and replace the outcome variable by a set of
values generated using the model in (10), with realistic values of the parameters $\{\sigma^2_{W,jk}\}$, $\sigma^2_B$, and $\{\mu_k\}$.

Thus, the group means $\{\mu_k\}$ are drawn independently from a normal distribution, and the within-cluster
standard deviations $\{\sigma_{W,jk}\}$ are drawn independently from a uniform distribution. The sets of group means and the
Figure 6: Comparison of the estimated and simulated within-cluster log-variances. (Five panels, Comparison 1 to Comparison 5; horizontal axes: simulated log-variance.)

Note: 'Comparison' k, k = 1, ..., 5, is the plot of the ordered values of the logarithms of the simulated
within-cluster variances against the ordered values of the logarithms of the estimated within-cluster variances
for the kth set of plausible values. The five simulations are mutually independent.

within-cluster variances are common to all datasets within a set of simulations, but different student- and
cluster-level deviations are drawn independently for each simulated dataset.
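The data-generating scheme just described can be sketched as follows. The default parameter values follow the notes to Table 7 (group means around 250, within-cluster standard deviations between 10 and 40, between-cluster variance 100); treating the second argument of N(250, 10) as a variance is an assumption, and the function names and toy structure are hypothetical.

```python
import math
import random


def draw_structure(groups, rng, mean=250.0, var_group=10.0,
                   sd_low=10.0, sd_high=40.0):
    """Draw the quantities held fixed across a set of simulations.

    groups : dict mapping each group to the list of its cluster labels
    Returns the group means and the within-cluster standard deviations.
    """
    mu = {k: rng.gauss(mean, math.sqrt(var_group)) for k in groups}
    sd_w = {(k, j): rng.uniform(sd_low, sd_high)
            for k in groups for j in groups[k]}
    return mu, sd_w


def simulate_outcomes(students, mu, sd_w, var_between, rng):
    """Simulate one replicate dataset.

    students : list of (group, cluster) pairs, one per student
    Fresh cluster- and student-level deviations are drawn on every call.
    """
    delta = {kj: rng.gauss(0.0, math.sqrt(var_between)) for kj in sd_w}
    return [mu[k] + delta[(k, j)] + rng.gauss(0.0, sd_w[(k, j)])
            for k, j in students]


# Hypothetical toy structure: two groups, three clusters, five students.
rng = random.Random(7)
groups = {"g1": ["c1", "c2"], "g2": ["c3"]}
students = [("g1", "c1")] * 2 + [("g1", "c2")] + [("g2", "c3")] * 2
mu, sd_w = draw_structure(groups, rng)
replicates = [simulate_outcomes(students, mu, sd_w, 100.0, rng)
              for _ in range(100)]
```

Calling `draw_structure` once and `simulate_outcomes` repeatedly mirrors the distinction in the text between quantities common to a set of simulations and deviations redrawn for each dataset.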

For example, for the entire sample, a variance of 10 for the group means, limits of 10 and 40 for the
within-cluster standard deviations, and $\sigma^2_B = 100$ generate datasets
with features similar to those of the survey dataset. Ignoring the within-cluster variation of the sampling
weights, the estimates of the within-cluster variances $\{\hat\sigma^2_{W,jk}\}$ have the $\sigma^2_{W,jk}/(n_{A,jk} - 1)$-multiple of the $\chi^2$ distribution
with $n_{A,jk} - 1$ degrees of freedom.

The plots in Figure 6 compare the estimates $\{\hat\sigma^2_{W,jk}\}$ for the five sets of plausible values for the New
Jersey dataset with five mutually independent sets of simulated estimates of the within-cluster variances,
drawn as realizations of the distributions $\sigma^2_{W,jk}\,\chi^2_{n_{A,jk}-1}/(n_{A,jk} - 1)$, where the variances $\sigma^2_{W,jk}$ are
drawn from the distribution specified above. The empirical distributions of the estimated and simulated variances appear to
have comparable features.
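A simulated estimate of a within-cluster variance can be drawn directly from the scaled chi-squared distribution described above. The sketch below is illustrative only; it uses the fact that a chi-squared variable with `df` degrees of freedom is a gamma variable with shape `df / 2` and scale 2, and the function name and toy values are assumptions.

```python
import math
import random


def simulate_variance_estimate(sigma2, n, rng):
    """Draw sigma2 * chi2(n - 1) / (n - 1) for a cluster with n >= 2 students."""
    df = n - 1
    chi2 = rng.gammavariate(df / 2.0, 2.0)  # chi-squared with df degrees of freedom
    return sigma2 * chi2 / df


# Ordered log-variances for hypothetical clusters, as plotted in a
# quantile-quantile comparison of simulated and estimated variances.
rng = random.Random(0)
cluster_sizes = [5, 12, 20, 8, 15]
true_variances = [rng.uniform(10.0, 40.0) ** 2 for _ in cluster_sizes]
simulated_logs = sorted(
    math.log(simulate_variance_estimate(s2, n, rng))
    for s2, n in zip(true_variances, cluster_sizes)
)
```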

For a set of generated proficiency scores the jackknife method, with the replicate weights from the
survey, and the model-based methods were applied. The following estimators were evaluated.

• the weighted mean;
Table 7: Summary of simulation of model-based estimators.

Estimator              Minimum      Mean    Median   Maximum  St. dev.  Deg. fr.
Wtd mean               248.144   250.273   250.292   252.672     1.197
Jack. mean             248.144   250.275   250.285   252.655     1.200
Jack. var.               0.720     1.351     1.328     2.251     0.340      31.0
CI var                   0.637     1.400     1.372     2.265     0.320      38.7
CI $\sigma^2_B$         32.231   100.285    97.044   180.142    29.092
CII var                  0.637     1.402     1.305     2.257     0.310      38.6
CII $\sigma^2_B$        32.231    99.860    90.157   179.413    29.003
CIII var                 0.633     1.412     1.390     2.367     0.324      38.0
CIII $\sigma^2_B$       30.043   100.815    99.150   189.343    29.445
CR var                   0.631     1.403     1.378     2.257     0.319      38.7
CR $\sigma^2_B$         20.820   100.031    97.550   179.392    29.005
C100 var                 0.630     1.418     1.392     2.288     0.317      40.1
C100 $\sigma^2_B$       30.550   101.333    98.725   182.211    28.808
REML var                 0.631     1.403     1.378     2.257     0.319      38.7
REML $\sigma^2_B$       20.830   100.032    97.550   179.393    29.005
$\sigma^2_W$ mean      000.000   007.077   007.041   712.931    21.501

Notes: The estimators of the sampling variance of the weighted mean are denoted by the method (for instance, REML)
and the symbol 'var' in the first column. The estimators of the between-cluster variance are denoted by the method
and the symbol $\sigma^2_B$. The estimator of the common within-cluster variance is given in the last row of the table (for
all methods separate within-cluster variances are estimated for each cluster). The group means, common to the
set of simulations, were generated from N(250, 10). The within-cluster deviations were generated from centered
normal distributions with standard deviations drawn from U(10, 40); the between-cluster deviations were generated
from N(0, 100). All the random draws were mutually independent. The standard deviations were common to the
simulations, but the (random) deviates were drawn independently for each replicate. One hundred replicates were
simulated.



• the jackknife mean;

• the within-cluster variances;

• the between-cluster variance estimates for the methods CI, CII, CIII, and CR, the estimate calculated with weights based
on $\sigma^2_B = 100$ (C100), and the REML estimate, obtained by an iteration of the weighted Fisher scoring algorithm described in Section
3.9;

• the estimated sampling variances of the estimators of the between-cluster variance.

Table 7 contains a summary of a set of simulations for the population mean. For each estimator (a
row of the table), the minimum, mean, median, and maximum realized value are given, as well as the
standard deviation of the realized values. The quantities in the extreme right column (degrees of freedom)
are discussed below.

The weighted mean (ratio) and the jackknife estimators of the population mean are almost identical.
Their departure from the superpopulation mean of 250 is due to the uncertainty associated with
the simulated group-level deviations, which are constant across the simulations. The true variance of the
weighted mean estimator can be calculated by substituting the generated values of the variances in
eqn. (18). The jackknife estimator of the sampling variance has a negative bias, while the model-based
estimators have positive biases. However, the contributions of these biases to the mean squared errors of
the estimated variances are negligible. Also, the bias in estimation of the between-cluster variance
($\sigma^2_B = 100$) is small relative to the variance of the estimator.

The methods denoted as CI, CII, CIII, and CR were introduced above. The method denoted as C100
is analogous to CR; it uses the value $\sigma^2_B = 100$ in the weights, as would be optimal if the between-cluster
variance were known to be $\sigma^2_B = 100$. Further, REML stands
for the weighted version of the restricted maximum likelihood method described in Section 3.9. Since a
good starting solution is provided by method CR, only one iteration of the Fisher scoring algorithm is
applied; preliminary exploration showed that further iterations change the value of the estimator of $\sigma^2_B$ by
less than 0.1.

All the estimators of the sampling variance of the weighted mean appear to be biased downward; the
observed variance of the simulated estimates of the mean is $1.197^2 = 1.43$. The estimator C100 most nearly
matches this value (its observed mean is 1.418), and the other model-based estimators are only marginally more biased.
The jackknife estimator of the sampling variance has by far the largest bias. Furthermore, the observed
standard deviation of the jackknife estimator is slightly higher than that of its model-based counterparts, so that
its mean squared error is also higher. The comparison of the standard errors (square roots of the
sampling variances) leads to the same conclusions.

Thus, there is little to choose between the model-based estimators of the sampling variance, perhaps
with the exception of the estimator C100, which assumes known between-cluster variance. It is encouraging,
though, that not knowing this variance causes only marginal loss of efficiency. The model-based estimators
have almost identical distributions and they are also very highly correlated. The additional computation
involved in the method CR makes only a marginal contribution to the efficiency.

However, for several simulations the differences between the estimates are considerable, as can be seen
in the pairwise plots of the sets of estimates in Figure 7. The methods CR and C100 are not represented
in the plots so as to achieve higher resolution and clarity.

The estimators of $\sigma^2_B$ have only small biases, although their distributions are somewhat skewed. Note
the substantial uncertainty in their estimation: the observed standard deviations are around 29. It is
surprising that the C100 estimator has the largest bias; however, its observed standard deviation (28.81)
and its root mean squared error (28.84) are the smallest.

An intuitively appealing way of comparing the efficiency of the estimators of the sampling variance is
Figure 7: Pairwise plots of the simulated estimates of sampling variance using jackknife and methods CI,
CII, CIII, and REML.
by fitting $\chi^2$ distributions to the empirical distributions of the estimators. Then more degrees of freedom
of the fitted $\chi^2$ distribution imply higher efficiency. The method of moments estimator of the degrees of
freedom is given by the equation

$$\mathrm{d.f.} = \frac{2 \times (\text{mean estimate of variance})^2}{\text{variance of the estimates}}. \tag{33}$$



These degrees of freedom are given in the extreme right column of Table 7. In evaluation of this
estimator we ignore the bias of the variance estimator. Not knowing $\sigma^2_B$ is associated with loss of up to 2
degrees of freedom, and using the jackknife method with an additional loss of 6.5 degrees of freedom.
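The method-of-moments rule can be checked against the printed summaries. The sketch below assumes that the 'St. dev.' column of Table 7 holds the standard deviation of the simulated variance estimates, so that its square is the denominator of the equation; the function name is hypothetical.

```python
def chi2_degrees_of_freedom(mean_estimate, sd_estimate):
    """Method-of-moments degrees of freedom of a fitted chi-squared
    distribution: twice the squared mean divided by the variance."""
    return 2.0 * mean_estimate ** 2 / sd_estimate ** 2


# Jackknife variance estimator in Table 7: mean 1.351, st. dev. 0.340.
df_jackknife = chi2_degrees_of_freedom(1.351, 0.340)
```

With these rounded inputs the result is about 31.6, close to the tabulated 31.0; the small difference is presumably due to rounding of the printed mean and standard deviation.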

The model-based estimators are mutually highly correlated: the correlations among the estimators CII,
CIII, CR, C100, and REML are 0.990 or higher, and the correlations of these estimators with the estimator CI
are 0.980 or higher. The correlations of the jackknife estimator with the model-based estimators are 0.800
or higher; the highest correlation is with estimator CI (0.865).

The set of 100 simulations was repeated with the same simulation parameters using a different sample
of simulated group deviations. Although substantially different values of the observed means and standard
deviations were obtained, their pattern, and conclusions about efficiency, were the same. For instance, in
one case the jackknife estimator of the sampling variance of the weighted mean had the least bias, but
in all cases it was associated with a 20 to 30 per cent loss of efficiency vis-a-vis either of the model-based
estimators.

In conclusion, the jackknife method does not deliver on its promise of unbiased estimation of the
sampling variance, and the efficiency of this estimator is also inferior to its model-based counterparts. The
model-based methods do not involve any appreciable loss of efficiency due to unknown between-cluster
variance. There appears to be little payoff for the set of 56 replicate weights required for the jackknife
estimation, which are dispensed with in the model-based methods.

Generalizability of these conclusions was explored by further simulations using different parameter
values and different datasets. The jackknife estimator of the sampling variance is particularly vulnerable
when the between-cluster variance $\sigma^2_B$ is large; the loss of efficiency for $\sigma^2_B = 150$ is about 30 per cent,
while for $\sigma^2_B = 50$ it is around 20 per cent. The choice of the distribution for the between-group differences
does not appear to affect the properties of the estimators. The estimators of $\sigma^2_B$ have negligible bias, and,
agreeing with intuition, have sampling variances increasing with $\sigma^2_B$.

4.1 Estimators for subpopulations 

Of particular importance are the relative performances of the studied estimators for smaller samples or
subpopulations. We describe in detail the simulations for the subpopulation of Hispanic students (response
3 to item 28) in the New Jersey sample. There are 363 Hispanic students in the sample; they are located in
84 clusters within 32 groups in the sample. The distribution of the Hispanic students across the clusters is
compactly summarized in Table 8. A large number of clusters contain only one to three Hispanic students,
while in a few clusters Hispanic students form a majority.
Table 8: Distribution of Hispanic students in the New Jersey sample

                         Students within clusters
Students    1   2   3   4   5   6   7   8  10  11  12  13  14  16  17  22
Clusters   27  16  10   3   7   5   3   1   1   2   2   2   1   1   2   1

Note: The second row contains the numbers of clusters which have the numbers of students given in the same
column. For instance, 27 clusters have one Hispanic student each.
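The totals implied by Table 8 can be recomputed as a consistency check. The two lists below are the table rows as read from the scanned original, so they are an assumption to the extent that the scan is imperfect.

```python
# Rows of Table 8: number of Hispanic students in a cluster, and the
# number of clusters with that many Hispanic students.
students_in_cluster = [1, 2, 3, 4, 5, 6, 7, 8, 10, 11, 12, 13, 14, 16, 17, 22]
number_of_clusters = [27, 16, 10, 3, 7, 5, 3, 1, 1, 2, 2, 2, 1, 1, 2, 1]

total_clusters = sum(number_of_clusters)
total_students = sum(s * c for s, c in zip(students_in_cluster, number_of_clusters))

# Clusters with at most three Hispanic students.
small_clusters = sum(
    c for s, c in zip(students_in_cluster, number_of_clusters) if s <= 3
)
```

Read this way, the table gives 84 clusters and 363 students, and 53 of the clusters contain at most three Hispanic students.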

Several values of the model-based estimators of $\sigma^2_B$ are negative (up to 12 per 100 simulated values),
and, for the purpose of calculating the weights, they are truncated to zero throughout. The results of
the simulations are given in Table 9 in the same format as Table 7.

The observed variance of the weighted means is equal to 5.52. The estimators CR and REML come
closest to matching this value, followed by C100, which assumes known variance $\sigma^2_B$ equal to the
corresponding simulation parameter. The price for unbiasedness is very high, though: both CR and REML have
much higher mean squared errors than the jackknife (2.20), or the other model-based estimators (1.88 for
C100, 1.95 for CII, 2.00 for CI, and 2.12 for CIII). The degrees of freedom loosely reflect the efficiency of
the estimators, although upward biased estimators appear in a somewhat better light. The difference of 0.7
degrees of freedom between the estimators C100 and CII, which can be attributed to information about $\sigma^2_B$,
appears to be trivial in comparison with the differences among the least biased and jackknife estimators
on one hand, and the estimators CI, CII, and CIII on the other hand.

The loss of efficiency due to not knowing $\sigma^2_B$ is quite moderate, even though estimation of $\sigma^2_B$ is
associated with a lot of uncertainty. However, if an incorrect value of $\sigma^2_B$ is assumed for the estimator
C100, a substantial loss of efficiency is incurred. For example, assuming that $\sigma^2_B = 150$ when in fact
$\sigma^2_B = 100$ yields a severely biased estimator with mean squared error comparable to the jackknife. In small
subsamples the sampling variance of the weighted mean is more strongly influenced by $\sigma^2_B$ than in the
entire sample. For example, the sampling variance of the weighted mean simulated using $\sigma^2_B = 70$ is about
one half of the sampling variance simulated using $\sigma^2_B = 150$.

In all datasets the assumption of equal within-cluster variances ($\sigma^2_W$) is associated with substantial loss
of efficiency.

These observed properties were confirmed in simulations based on several other subsamples with sizes
130 to 500. In general, the jackknife estimator is biased and its sampling variance is larger than that of
the model-based estimators, with occasional exception of CR. The estimator CIII is the most efficient one
for some subsamples (following C100), but performs rather poorly for others, though never worse than the
jackknife. The performance of the estimators CI and CII is much more consistent; CII is uniformly more
efficient than CI, but the difference is unimportant in comparison with the improvement these estimators
represent over the other methods. The jackknife is least competitive for the smallest datasets (losses in
efficiency of up to 15 per cent) and, ironically, for the entire sample. Inefficiency of the jackknife for larger



Table 9: Summary of simulation of model-based estimators for Hispanic students.

Estimator             Minimum     Mean   Median  Maximum  St. dev.  Deg. fr.
Wtd mean               243.00   248.75   248.78   255.99      2.35
Jack. mean             242.94   248.76   248.78   255.98      2.31
Jack. var.               1.86     4.92     4.58    13.26      2.11      10.9
CI var                   2.19     4.89     4.62    11.94      1.90      13.2
CI $\sigma^2_B$          0      112.59    97.31   381.13     80.48
CII var                  1.88     5.33     4.90      ...      1.94      15.1
CII $\sigma^2_B$         0       89.17    68.11   383.11     77.95
CIII var                 2.10     5.73     5.31    11.20      2.10      15.0
CIII $\sigma^2_B$        0      127.78   111.45   383.01     85.44
CR var                   1.93     5.42     4.78    16.61      2.84       7.3
CR $\sigma^2_B$          0      114.91    96.78   407.29     97.20
C100 var                 3.73     5.23     4.90    12.69      1.86      15.8
C100 $\sigma^2_B$       71.18   109.90    99.20   430.66     84.61
REML var                 1.93     5.42     4.78    16.63      2.83       7.3
REML $\sigma^2_B$        0      114.90    96.87   438.36     97.24
$\sigma^2_W$ mean      413.20   688.69   669.80  1049.21    116.12

Notes: The same notation and layout is used as in Table 7. Three hundred replicates were simulated, using the
simulation parameters given in Table 7.

samples may be due to not using within-cluster information.

5 Smoothing techniques 

The model-based method of estimation of the standard error of the mean proficiency for a subpopulation
involves estimation of the cluster- and student-level variance components $\sigma^2_B$ and $\{\sigma^2_{W,jk}\}$. For small
subsamples, especially those with only a few strata represented by more than one large cluster, the estimates
of these variances have large sampling variances. Clearly, estimation of these variances is the Achilles heel
of the model-based methods; it is exceedingly inefficient when a large number of subsamples is analyzed
because information about the variances contained in the analyzed subsample could be complemented by
the other subsamples.

Although it is not reasonable to assume that all the subsamples have the same between-cluster variance
$\sigma^2_B$, suitably selected sets of subsamples may share a common variance $\sigma^2_B$. Then estimation of $\sigma^2_B$ can
be strengthened by averaging the estimates of $\sigma^2_B$ across the subsamples. In this process of averaging
more weight can be given to larger subsamples. Also, the weights may vary depending on the analyzed
subsample. Schemes motivated by shrinkage estimation or empirical Bayes methods may be particularly
useful. In a typical such scheme each variance for a subsample is estimated as the weighted mean of the
variances for the entire set of subsamples, with the weights associated with each subsample held constant,
except for the analyzed subsample which is given more weight.
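A scheme of this kind can be sketched as a weighted mean in which the analyzed subsample is up-weighted. The weights proportional to subsample size and the multiplier `extra_weight` are illustrative assumptions; the text does not prescribe particular values.

```python
def smoothed_variance(estimates, sizes, target, extra_weight=2.0):
    """Weighted mean of subsample variance estimates.

    estimates    : variance estimate for each subsample
    sizes        : subsample sizes, used as baseline weights (assumption)
    target       : index of the analyzed subsample
    extra_weight : factor by which the analyzed subsample is up-weighted
    """
    weights = [
        n * (extra_weight if i == target else 1.0)
        for i, n in enumerate(sizes)
    ]
    return sum(w * v for w, v in zip(weights, estimates)) / sum(weights)


# Hypothetical estimates of the between-cluster variance for three subsamples.
estimates = [80.0, 120.0, 150.0]
sizes = [400, 250, 130]
smoothed = [smoothed_variance(estimates, sizes, target=i) for i in range(3)]
```

Each smoothed value lies between the smallest and the largest raw estimate and is pulled toward the estimate for the analyzed subsample as `extra_weight` grows.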

Care has to be exercised in averaging the estimated within-cluster variances for a cluster because
subsamples may have substantially different within-cluster variances. Each cluster within a subsample can
be considered as an informatively selected subsample. The change in the within-cluster variances is greater
the more closely the variable on which the selection is based is associated with proficiency. In any case,
estimation of the within-cluster variance is simple (based on the sum of squares $r_{A,jk}$), and a variety of
schemes of pooling information across subsamples and/or clusters can be devised.

The influence of the sampling errors associated with estimation of the variance components can be
further reduced by the following scheme. The sampling variance of the estimator of a subpopulation mean
is a linear function of the variance components:

$$\mathrm{var}(\hat\mu_a) = S_{a,B}\,\sigma^2_B + \sum_{jk} S_{a,W,jk}\,\sigma^2_{W,jk}, \tag{34}$$

where $S_{a,B}$ and $S_{a,W,jk}$ are functions of the sampling weights.



The subscript $a = 1, \ldots, A$ is added to several quantities in (34) and throughout this section to emphasize
their dependence on the analyzed subsample (dataset). For the estimated sampling variances $\widehat{\mathrm{var}}(\hat\mu_a)$ for
subpopulations $a$ we consider the regression equation

$$\widehat{\mathrm{var}}(\hat\mu_a) = S_{a,B}\,\sigma^2_B + \sum_{jk} S_{a,W,jk}\,\sigma^2_{W,jk} + \varepsilon_a, \tag{35}$$

where $\varepsilon_a$, $a = 1, \ldots, A$, are subpopulation-specific random terms and $\sigma^2_B$ and $\{\sigma^2_{W,jk}\}$ are sets of variances
common across the subpopulations. The term $\varepsilon_a$ consists of two components: the error of estimation,
$\widehat{\mathrm{var}}(\hat\mu_a) - \mathrm{var}(\hat\mu_a)$, and the model deviation $\mathrm{var}(\hat\mu_a) - S_{a,B}\,\sigma^2_B - \sum_{jk} S_{a,W,jk}\,\sigma^2_{W,jk}$. The 'regression' parameters
$\sigma^2_B$ and $\sigma^2_{W,jk}$ may be assumed known or unknown. In the latter case they can be estimated by standard
regression methods.

Thus, given a set of estimates of the sampling variances $\widehat{\mathrm{var}}(\hat\mu_a)$, and assuming common variance
components across the subsamples $\{a\}$, these estimates can be 'improved', or smoothed, by fitting the
linear model (35), and taking the fitted values

$$\widetilde{\mathrm{var}}(\hat\mu_a) = S_{a,B}\,\hat\sigma^2_B + \sum_{jk} S_{a,W,jk}\,\hat\sigma^2_{W,jk}, \tag{36}$$

where the estimates of the variance components are obtained by a weighted least squares fit with weights
reflecting differential precision of the estimators $\widehat{\mathrm{var}}(\hat\mu_a)$. Since there are a large number of clusters it is
expedient to substitute the estimates $\hat\sigma^2_{W,jk}$ for the corresponding variances in (35), thus obtaining a simple
regression on $S_{a,B}$, with no intercept. To take account of differential precision of the estimated variances
$\widehat{\mathrm{var}}(\hat\mu_a)$ (estimated, say, by jackknife) suitable regression weights, such as the sample size, cluster sample
size, or their linear combination can be applied. Stability of the estimate of the common between-cluster
variance $\sigma^2_B$ can be explored by varying (perturbing) the regression weights. An important diagnostic
check for appropriateness of the model in (34) is that the regression intercept, if estimated, is close to zero.
The smoothed estimate of the sampling variance is the fitted value (36), where the variances are either
estimated exclusively from the subsample to which they refer, by a regression (common to all subsamples),
or as a compromise of these two approaches (such as an empirical Bayes approach).

Extensions and several adaptations of this approach are easy to devise, for example, by introducing
different variances $\sigma^2_B$ for disjoint subsets of subsamples. The regression equation in (34) can be supplemented
by other data summaries (not only functions of the sampling weights), as well as by an intercept term, thus
obtaining a better fit, although the original interpretation in terms of a common between-cluster variance
would no longer apply.

An important concern pertains to normality and homogeneity of the 'error' terms $\varepsilon_a$. Instead of the
regression of $z_a = \widehat{\mathrm{var}}(\hat\mu_a) - \sum_{jk} S_{a,W,jk}\,\hat\sigma^2_{W,jk}$ on $S_{a,B}$ we may consider the regression

$$z_a / S_{a,B} = \sigma^2_B + \varepsilon'_a, \tag{37}$$

in which the assumption of i.i.d. for $\varepsilon'_a = \varepsilon_a / S_{a,B}$ may be more palatable. Now the common variance $\sigma^2_B$
is estimated as the mean of the quantities $z_a / S_{a,B}$. As an alternative, a suitable transformation of $z_a$ can
be applied; in particular, if $\min(z_a)$ is positive, log-transformation leads to the geometric average of $z_a$ as
an estimator of $\sigma^2_B$. A transformation may also be applied to $z_a / S_{a,B}$.
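
The two estimators of a common between-cluster variance described above (a weighted regression through the origin, and the mean of the ratios) can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names, argument names and the toy numbers are hypothetical and do not come from the report, and Python is used here in place of the Splus of the Appendix.

```python
import numpy as np

def common_between_variance(z, s_b, weights=None, use_ratios=False):
    """Estimate a common between-cluster variance.

    z       : for each subsample, the estimated sampling variance minus the
              within-cluster part (the quantity regressed in the text)
    s_b     : for each subsample, the coefficient of the between-cluster variance
    weights : regression weights (for example sample sizes); equal by default
    """
    z = np.asarray(z, dtype=float)
    s_b = np.asarray(s_b, dtype=float)
    w = np.ones_like(z) if weights is None else np.asarray(weights, dtype=float)
    if use_ratios:
        # Weighted mean of the ratios z / s_b.
        return float(np.sum(w * z / s_b) / np.sum(w))
    # Weighted least squares slope of z on s_b with no intercept.
    return float(np.sum(w * s_b * z) / np.sum(w * s_b ** 2))

def smoothed_variances(z, s_b, within_part, weights=None):
    """Fitted (smoothed) sampling variances: s_b * variance + within-cluster part."""
    sigma2_b = common_between_variance(z, s_b, weights)
    return np.asarray(s_b, dtype=float) * sigma2_b + np.asarray(within_part, dtype=float)

# Hypothetical toy input: three subsamples.
z = [0.8, 1.3, 2.1]
s_b = [0.4, 0.6, 1.0]
print(common_between_variance(z, s_b))
print(common_between_variance(z, s_b, use_ratios=True))
```

Varying the `weights` argument gives a simple way of perturbing the regression weights to explore the stability of the estimate.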

Of course, the method outlined in this section can be applied to another model for the outcomes, which
would lead to a relationship between sampling variance and a different set of summaries of the dataset.
The method requires an estimator of the sampling variances. The jackknife or a model-based estimator
can be used, and it is then improved by the smoothing.

An important issue in application of these methods is identification of subsets with equal (homogeneous)
within- and between-cluster variances. Detailed understanding of the educational, social, behavioural, and
economic processes relevant to the target population may promote an intelligent choice in this myriad
of smoothing schemes. An interesting option is that of combining the estimates of the variances based
on a subsample with the estimates of variances (their means) from the other subsamples. The mixing
proportions should depend on the (effective) sample size(s) of the analyzed subsample. Such a scheme,
motivated by empirical Bayes methods, has great potential but requires careful experimentation and
fine-tuning which are beyond the scope of this project.
6 Regression with survey data

Ordinary regression provides an easy to interpret assessment of association of one variable (the response)
with a set of other (explanatory) variables. The standard least squares method for estimation of this
regression function is applicable when the following assumptions are satisfied: the values of the explanatory
variables are under control of the experimenter/analyst, the outcomes are generated by a process which
assigns values of the response conditionally independent and normally distributed with constant
(conditional) variance given the relationship to the explanatory variables (regression function), and the regression
function is linear in the explanatory variables. In general, the assumptions of conditional independence
and of non-informative selection of subjects are crucial for validity of the inference based on the regression
estimator.

In regression analysis using survey data there are two distinct challenges: definition of the estimated
quantity, and taking account of the features of the sampling design. For conceptual clarity we consider first
the hypothetical situation in which the values of the response $Y$ and of the (single) explanatory variable
$X$, denoted respectively by $Y_i$ and $X_i$, $i = 1, \ldots, N$, are available for the entire population. With such
data we would calculate the regression slope as

$$\beta = \frac{\sum_i (X_i - \bar{X})(Y_i - \bar{Y})}{\sum_i (X_i - \bar{X})^2} = \frac{N^{-1}\sum_i X_i Y_i - \bar{X}\bar{Y}}{N^{-1}\sum_i X_i^2 - \bar{X}^2}, \tag{38}$$

where $\bar{X} = N^{-1}\sum_i X_i$ and $\bar{Y} = N^{-1}\sum_i Y_i$ are the respective population means of $X$ and $Y$; all the
summations are over $i = 1, \ldots, N$. Since our inference is conditional on the target population, we regard
$\beta$ in (38) as an unknown constant.

As an alternative, we may consider a construct (latent) variable $Y^*$, observed or measured indirectly
and subject to random deviation (error) by the variable $Y$, and suppose that $Y^*$ is linearly related to $X$;

$$Y^*_i = \alpha^* + \beta^* X_i .$$

If the values of $X_i$ and $Y_i$ were available for the entire population $\beta$ would, under certain standard
assumptions, be a 'good' estimator of $\beta^*$.

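The (residual) variance of the deviations $Y_i - Y^*_i$ is estimated as

$$\sigma^2 = \frac{1}{N - 2}\sum_i (Y_i - \alpha - \beta X_i)^2, \tag{39}$$

where $\alpha = \bar{Y} - \bar{X}\beta$. We emphasize that the variance $\sigma^2$ in (39) is a constant or an estimator, depending
on the adopted perspective. The variance of $\beta$, as an estimator of $\beta^*$, is $\sigma^{*2} / \{\sum_i (X_i - \bar{X})^2\}$, and would
itself be estimated by

$$\frac{\sigma^2}{\sum_i (X_i - \bar{X})^2} . \tag{40}$$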

We consider estimation of the quantities (38)-(40) based on a stratified clustered sample from the
target population. Our approach is based on estimation of the population means $N^{-1}\sum_i X_i Y_i$, $N^{-1}\sum_i X_i^2$,
$N^{-1}\sum_i Y_i^2$, $\bar{X}$ and $\bar{Y}$, for which methods discussed in Section 3 are applicable.
As the estimator of the population summary $N^{-1}\sum_i X_i Y_i$ we use the statistic

$$\overline{xy} = \frac{\sum_{ijk} w_{ijk}\, x_{ijk}\, y_{ijk}}{\sum_{ijk} w_{ijk}}, \tag{41}$$

where the summations are over all the sampled elementary units (students $i$ in cluster $j$ in group $k$). In
order to find the (approximate) distribution of the estimators of (38)-(40) based on a random sample
we require the covariance structure of the estimators of population means. The statistic $\overline{xy}$ is the
estimator (12) applied to the product $XY$. Its sampling mean and variance are derived in Section 3.1.
For the product $\bar{X}\bar{Y}$ we consider the estimator $\bar{x}\bar{y}$, the product of the estimators of the two population means;
its expectation is

$$E(\bar{x}\bar{y}) = \mathrm{cov}(\bar{x}, \bar{y}) + E(\bar{x})\,E(\bar{y}) .$$

The covariance can be expressed in terms of variances:

$$\mathrm{cov}(\bar{x}, \bar{y}) = \frac{1}{2}\left\{\mathrm{var}(\bar{x} + \bar{y}) - \mathrm{var}(\bar{x}) - \mathrm{var}(\bar{y})\right\} . \tag{42}$$

Note that $\bar{x}\bar{y}$ is an unbiased estimator of $\bar{X}\bar{Y}$ only when $\bar{x}$ and $\bar{y}$ are uncorrelated. This is unlikely, though,
when $X$ and $Y$ are associated. The bias is small when $\mathrm{cov}(\bar{x}, \bar{y})$ is much smaller than $\bar{X}\bar{Y}$.
The sampling variance of $\bar{x}\bar{y}$ is

$$\mathrm{var}(\bar{x}\bar{y}) = \mathrm{cov}(\bar{x}^2, \bar{y}^2) + E(\bar{x}^2)\,E(\bar{y}^2) - \{E(\bar{x}\bar{y})\}^2 . \tag{43}$$

Evaluation of the covariance $\mathrm{cov}(\bar{x}^2, \bar{y}^2)$ in general is not straightforward. If the assumptions of
normality of $\bar{x}$ and $\bar{y}$ are adopted, then this covariance is a function of means, the covariance, and
the variances of $\bar{x}$ and $\bar{y}$. The derivation, given below, uses properties of the conditional distributions
under normality assumptions.

The conditional distribution of $\bar{x}$ given $\bar{y}$ is

$$N\left\{\mu_x + \frac{\sigma_{xy}}{\sigma_y^2}(\bar{y} - \mu_y),\; \sigma_x^2 - \frac{\sigma_{xy}^2}{\sigma_y^2}\right\},$$

where $\mu$ and $\sigma$ (with the appropriate subscripts) stand for the expectations and (co-)variances of $\bar{x}$
and $\bar{y}$. Now

$$E(\bar{x}^2 \mid \bar{y}) = \sigma_x^2 - \frac{\sigma_{xy}^2}{\sigma_y^2} + \left\{\mu_x + \frac{\sigma_{xy}}{\sigma_y^2}(\bar{y} - \mu_y)\right\}^2 .$$

Taking expectation over $\bar{y}$ and assuming that $\bar{y}$ is normally distributed, we obtain

$$E(\bar{x}^2\bar{y}^2) = E\{\bar{y}^2\, E(\bar{x}^2 \mid \bar{y})\} = (\mu_x^2 + \sigma_x^2)(\mu_y^2 + \sigma_y^2) + 2\sigma_{xy}^2 + 4\mu_x\mu_y\sigma_{xy} . \tag{44}$$
The other terms in the estimator

$$\hat\beta = \frac{\overline{xy} - \bar{x}\bar{y}}{\overline{x^2} - \bar{x}^2} \tag{45}$$

($\overline{x^2}$ is the sample mean of $x^2$ with sampling weights) can be determined from the sampling means and variances of $\bar{x}$,
$\bar{y}$, and their covariance.

The expectations and variances of the numerator and denominator in (45) are found by the following
method. Let $\bar{x}_1$, $\bar{x}_2$, and $\bar{x}_3$ be trivariate normally distributed random variables with respective means $\mu_1$,
$\mu_2$, and $\mu_3$, and covariance matrix $\Sigma$ (with elements $\Sigma_{11}$, $\Sigma_{12}$, and so on). Then

$$\mathrm{var}(\bar{x}_1 - \bar{x}_2\bar{x}_3) = E\{(\bar{x}_1 - \bar{x}_2\bar{x}_3)^2\} - \{E(\bar{x}_1 - \bar{x}_2\bar{x}_3)\}^2$$
$$= \Sigma_{11} - 2\mu_2\Sigma_{13} - 2\mu_3\Sigma_{12} + \mu_2^2\Sigma_{33} + \mu_3^2\Sigma_{22} + 2\mu_2\mu_3\Sigma_{23} + \Sigma_{22}\Sigma_{33} + \Sigma_{23}^2 . \tag{46}$$

The equation for the denominator is obtained from (46) by setting $\bar{x}_2 = \bar{x}_3$ ($\Sigma_{22} = \Sigma_{33} = \Sigma_{23}$, $\Sigma_{12} = \Sigma_{13}$,
and $\mu_2 = \mu_3$):

$$\mathrm{var}(\bar{x}_1 - \bar{x}_2^2) = \Sigma_{11} + 2\Sigma_{22}^2 + 4\mu_2^2\Sigma_{22} - 4\mu_2\Sigma_{12} .$$
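
The two variance formulas for the numerator and denominator can be written as small Python functions, which makes it easy to check that the second is the special case of the first. This is a sketch only: it uses the standard moments of the trivariate normal distribution, and the function names and numerical inputs are hypothetical, not taken from the report.

```python
def var_x1_minus_x2x3(mu, cov):
    """Variance of x1 - x2*x3 for trivariate normal (x1, x2, x3).

    mu  : sequence of the three means (mu1, mu2, mu3)
    cov : 3 x 3 covariance matrix as nested sequences
    """
    _, m2, m3 = mu
    return (cov[0][0]
            - 2.0 * m2 * cov[0][2] - 2.0 * m3 * cov[0][1]      # -2 cov(x1, x2*x3)
            + m2 ** 2 * cov[2][2] + m3 ** 2 * cov[1][1]        # var(x2*x3), linear part
            + 2.0 * m2 * m3 * cov[1][2]
            + cov[1][1] * cov[2][2] + cov[1][2] ** 2)          # var(x2*x3), quadratic part

def var_x1_minus_x2_squared(mu2, s11, s12, s22):
    """Variance of x1 - x2**2 for bivariate normal (x1, x2)."""
    return s11 + 2.0 * s22 ** 2 + 4.0 * mu2 ** 2 * s22 - 4.0 * mu2 * s12

# Setting x3 = x2 in the general formula reproduces the special case.
mu = (1.0, 2.0, 2.0)
cov = [[3.0, 0.5, 0.5],
       [0.5, 1.5, 1.5],
       [0.5, 1.5, 1.5]]
print(var_x1_minus_x2x3(mu, cov), var_x1_minus_x2_squared(2.0, 3.0, 0.5, 1.5))
```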

Finally, in order to determine the moments of the ratio in (45) we require the covariance of the numerator and
denominator. An exact method appears to require considerable effort since the moments of various non-linear
functions of the data are required. On the other hand, especially if $\beta$ is positive, the correlation of the
numerator and denominator is likely to be very high, and the distribution of the ratio can be estimated by
imputing one or several realistic values of this correlation.

We write the estimator (45) as

$$\hat\beta = \frac{u_1 + \varepsilon_1}{u_2 + \varepsilon_2}, \tag{47}$$

where $u_1$ and $u_2$ are constants, the respective expectations of the numerator and denominator in (45),
and $\varepsilon_1$ and $\varepsilon_2$ are centered random variables with respective variances $\sigma_1^2$ and $\sigma_2^2$ and covariance $\sigma_{12}$. Let
$\rho_{12} = \sigma_{12} / (\sigma_1\sigma_2)$ be the correlation of $\varepsilon_1$ and $\varepsilon_2$. Suppose $|\varepsilon_2|$ is much smaller than $u_2$, with high
probability; then $|\varepsilon_2|^h$ is even smaller relative to $u_2^h$ for $h = 2, 3, \ldots$. Then from the expansion

$$\frac{1}{u_2 + \varepsilon_2} = \frac{1}{u_2}\left(1 - \frac{\varepsilon_2}{u_2} + \frac{\varepsilon_2^2}{u_2^2} - \ldots\right)$$

(ignoring the higher-order terms), we have

$$E(\hat\beta) \approx \frac{u_1}{u_2} - \frac{\sigma_{12}}{u_2^2} + \frac{u_1\sigma_2^2}{u_2^3},$$

$$\mathrm{var}(\hat\beta) \approx \frac{\sigma_1^2}{u_2^2} + \frac{u_1^2\sigma_2^2}{u_2^4} - \frac{2u_1\sigma_{12}}{u_2^3} . \tag{48}$$


These equations can be supplemented by further terms involving higher moments of the random variables
$\varepsilon_1$ and $\varepsilon_2$. The values of these moments are not determined by the variance matrix of $(\varepsilon_1, \varepsilon_2)$ unless
normality is assumed. However, the assumptions of normality are palatable for large sample sizes.

Note that unbiasedness of the numerator and denominator in (45), as estimators of their respective
population counterparts, does not imply unbiasedness of $\hat\beta$, not even when $\sigma_{12} = 0$.
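
The approximate mean and variance of a ratio given in (48) are straightforward to compute once the moments of the numerator and denominator are available. The Python sketch below is illustrative only: the function names and numbers are hypothetical, and the second helper shows how an imputed correlation of the numerator and denominator could be converted to the covariance that the formula needs.

```python
import math

def ratio_moments(u1, u2, var1, var2, cov12):
    """Second-order (delta method) approximation to the mean and variance of
    (u1 + e1) / (u2 + e2), where e1 and e2 are centered with variances var1 and
    var2 and covariance cov12."""
    mean = u1 / u2 - cov12 / u2 ** 2 + u1 * var2 / u2 ** 3
    var = var1 / u2 ** 2 + u1 ** 2 * var2 / u2 ** 4 - 2.0 * u1 * cov12 / u2 ** 3
    return mean, var

def ratio_moments_for_correlation(u1, u2, var1, var2, rho):
    """Same approximation, with the covariance implied by an imputed correlation."""
    cov12 = rho * math.sqrt(var1 * var2)
    return ratio_moments(u1, u2, var1, var2, cov12)

# Hypothetical inputs; the approximate variance is evaluated for several
# imputed correlations between 0 and 1.
for rho in (0.0, 0.5, 1.0):
    print(rho, ratio_moments_for_correlation(6.0, 4.0, 0.09, 0.04, rho))
```

With a positive numerator and denominator the approximate variance decreases as the imputed correlation increases, which is why the choice of the correlation matters mainly for the standard error of the ratio.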

6.1 Residual variance

The residual variance $\sigma^2$ can be estimated by the same approach. Ignoring the degrees of freedom in
regression for the population, $N/(N - 2) \approx 1$, we have

$$\sigma^2 = \overline{Y^2} - \bar{Y}^2 - \frac{(\overline{XY} - \bar{X}\bar{Y})^2}{\overline{X^2} - \bar{X}^2}, \tag{49}$$

where the bar over variables $X$ and $Y$ and their functions stands for corresponding population mean.
The naive estimator of $\sigma^2$ is constructed by replacing each population mean in (49) by its (jackknife or
model-based) estimator. The mean square $\overline{Y^2} - \bar{Y}^2$ is estimated without bias by $\overline{y^2} - \bar{y}^2 + \mathrm{var}(\bar{y})$. The
denominator of the fraction in (49) is estimated by $\overline{x^2} - \bar{x}^2 + \mathrm{var}(\bar{x})$, and the numerator by

$$(\overline{xy} - \bar{x}\bar{y})^2 - \mathrm{var}(\overline{xy} - \bar{x}\bar{y}) .$$

To obtain the expectation of this estimator we consider the expectations of the numerator and
denominator in (49): the former is

$$E\{(\overline{xy} - \bar{x}\bar{y})^2\} = \mathrm{var}(\overline{xy} - \bar{x}\bar{y}) + \{E(\overline{xy} - \bar{x}\bar{y})\}^2, \tag{50}$$

and the latter is derived in complete analogy with the mean square for $Y$.
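
The naive estimators of the regression parameters and of the residual variance use nothing but weighted means of $x$, $y$, $x^2$, $xy$ and $y^2$. A minimal Python sketch follows, assuming that simple weighted (ratio) means stand in for the estimators of the population means and omitting the bias adjustments discussed above; the function name and the data are hypothetical.

```python
import numpy as np

def regression_from_means(x, y, w):
    """Intercept, slope and residual variance computed from weighted means only."""
    x, y, w = (np.asarray(a, dtype=float) for a in (x, y, w))

    def wmean(v):
        # Ratio estimator of a population mean: weighted total over total weight.
        return float(np.sum(w * v) / np.sum(w))

    mx, my = wmean(x), wmean(y)
    mxx, mxy, myy = wmean(x * x), wmean(x * y), wmean(y * y)
    sxx = mxx - mx ** 2          # denominator of the slope
    sxy = mxy - mx * my          # numerator of the slope
    slope = sxy / sxx
    intercept = my - slope * mx
    resid_var = myy - my ** 2 - sxy ** 2 / sxx
    return intercept, slope, resid_var

# Hypothetical data lying exactly on the line y = 1 + 2x.
print(regression_from_means([0, 1, 2, 3], [1, 3, 5, 7], [1, 2, 1, 2]))
```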

6.2 Implementation 

The estimation procedure described above requires estimation of the sampling variance matrix of the means
of $X$, $Y$, $X^2$, $XY$, and $Y^2$. This can be accomplished by jackknife, the model-based, or, in principle, any
other method. The extension of both for multivariate statistics is discussed in Section 3.8. The estimated
moments are then substituted for the 'true' moments in (46) applied for the numerator and denominator
of the regression estimator, and in the estimators of (39) and (40). The estimator $\hat\beta$ can be (approximately)
corrected for bias using equation (48).

Note that the method described in this section is applicable for any sampling design since it is based
on the expectations and covariance structure of the population means of certain variables.

6.3 Regression with jackknife 

The jackknife method discussed in Section 3.3 has a straightforward extension to ordinary regression.

In essence, we replace the mean as the 'parent' method by the weighted regression, and subject each
component of the regression vector estimate (the intercept and slope in the case of simple regression), as
well as the residual variance, to the jackknife estimation.

Thus, in NAEP Trial State Assessment an ordinary regression is fitted for each combination of each set of
replicate weights (and the final sampling weights) and plausible values, a total of 57 x 5 = 285 regressions.
The sets of estimates of the regression parameters and of the residual variances are then summarized to
obtain the jackknife estimates of those parameters. Although relatively easy to implement this procedure
demands a lot of computing time without necessarily providing an efficient estimator of the standard errors.
A short-cut, using the jackknife as described above for the first plausible value, and estimating the residual
variance from the regressions with the final weights for the other four plausible values, is used in operation.
This way only 61 regressions have to be fitted.
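
The following Python sketch illustrates the idea of applying the jackknife to a weighted regression. It rests on an assumption that is not confirmed by this section: the jackknife variance is taken to be the sum of squared deviations of the replicate estimates from the estimate based on the final weights. The function names and data are hypothetical.

```python
import numpy as np

def weighted_regression(x, y, w):
    """Intercept and slope of the weighted least squares line."""
    x, y, w = (np.asarray(a, dtype=float) for a in (x, y, w))
    xbar = np.sum(w * x) / np.sum(w)
    ybar = np.sum(w * y) / np.sum(w)
    slope = np.sum(w * (x - xbar) * (y - ybar)) / np.sum(w * (x - xbar) ** 2)
    return ybar - slope * xbar, slope

def jackknife_regression(x, y, final_weights, replicate_weights):
    """Estimates from the final weights and jackknife standard errors.

    Assumption: variance = sum over replicates of (replicate estimate -
    full-sample estimate)**2; other scalings are used with other designs.
    """
    est = np.array(weighted_regression(x, y, final_weights))
    reps = np.array([weighted_regression(x, y, w) for w in replicate_weights])
    return est, np.sqrt(np.sum((reps - est) ** 2, axis=0))

# Hypothetical toy data: one regression per set of weights (final plus replicates).
x = [0.0, 1.0, 2.0, 3.0]
y = [1.2, 2.9, 5.1, 6.8]
final = [1.0, 1.0, 1.0, 1.0]
replicates = [[2.0, 0.0, 1.0, 1.0], [0.0, 2.0, 1.0, 1.0]]
print(jackknife_regression(x, y, final, replicates))
```

Applied to each plausible value in turn, this requires one fit per set of weights and plausible value, which is the source of the computing burden noted above.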

6.4 Example 

For illustration we construct a regressor variable as the total of all the non-missing responses to items
•>'.'>\ 2:S* ('In Math class how often do you ...'). Each of these items is scored on the Likert scale (1-5,
ordinal). Data from 2171 students ( t su.l per cent of the sample) from Mil clusters, who responded to
each item are used in the analysis. The jackknife analysis, involving 57 x 5 = 285 weighted least squares
regression fits, is summarized in the top panel of Table 10. The jackknife estimates of the intercept, slope,
and residual variance are given for each plausible value, and the estimates for the proficiency scores are
given in the right-most column.

The results of the model-based regression method are given in the bottom panel of Table 10. They are in
close agreement with the jackknife method. The estimated correlation of the numerator and denominator
of $\hat\beta$ in (45) is equal to 0.40, but even for imputed correlations of zero and unity the results are not
substantially different. Table 11 displays the results for the proficiency scores, summarizing the analyses
for each plausible value. To explore the influence of the imputed correlation, the results are given for
correlations 0, 0.2, ..., 1. The standard errors of the slope estimator are affected a great deal by the choice
of the correlation, but for the estimates (intercept, slope, and residual variance) the choice of the correlation
is not critical. This suggests a simplification of the Taylor expansion method presented in Section 6.

6.5 Multivariate and multilevel regression 

The model-based method for simple regression can serve as an outline for extensions to multivariate and
multilevel regression. In the former, the population regression parameter is defined as
Table 10: Regression analysis using the jackknife and model-based methods; New Jersey data.

                                     Plausible value                                  Prof.
Parameter             1            2            3            4            5           value

Jackknife
Intercept          230.03       230.02       220.03       230.33      22fl.fiij      229.34
Slope             1 . io:i       1.101        1.114        l .ma      I - r ;21       1.420
                  (0.227)      (0.218)      (0.218)      (0.225)      (0.233)      (0.233)
Res. variance   1 012. or,     1031.51      1015.88      1013.02      1018.37      1030.17

Model-based method
Intercept          230.03       229.02       220.02       230.10       220.01       220.28
Slope              ; .107      1 ,1<Ki        1.418        1.303        1.328        1.132
                  (0.224)      (0.210)      (0.219)      (0.223)      (0.230)      (0.232)
Res. variance   10-10. 1*2     UriH.nl      101.117    i 0-1(1.37     1018.33      1030.14

The estimated standard errors are given in parentheses.


$$\beta = (X^\top X)^{-1} X^\top Y, \tag{51}$$

where $X$ is the $N \times p$ (population) matrix of the regressors (the population design matrix), and $Y$ is the
$N \times 1$ vector of the outcomes for the population. Each total of crossproducts in (51) can be estimated by
the corresponding ratio estimator and the sample covariances can be obtained from the sampling variances
using the formula $\mathrm{cov}(a, b) = \frac{1}{2}\{\mathrm{var}(a + b) - \mathrm{var}(a) - \mathrm{var}(b)\}$, where $a$ and $b$ denote ratio estimators of the
population means of products $X_h X_{h'}$. The expectation and the variance matrix of the sample counterpart
of (51) can be approximated by the multivariate version of the delta method. Let

$$\hat\beta = (u_2 + \varepsilon_2)^{-1}(u_1 + \varepsilon_1),$$

where $u_h$ are the expectations and $\varepsilon_h$ the deviations from the expectations of the numerator ($h = 1$) and
denominator ($h = 2$) in (51). We denote the covariance matrices for row $h$ of $\varepsilon_2$ with $\varepsilon_1$ by $\Sigma_{12,h} =
\mathrm{cov}(\varepsilon_{2,h}, \varepsilon_1)$, for two rows of $\varepsilon_2$ by $\Sigma_{2,hh'} = \mathrm{cov}(\varepsilon_{2,h}, \varepsilon_{2,h'})$, and set $\Sigma_1 = \mathrm{var}(\varepsilon_1)$.
Assuming that $|\varepsilon_2|$ is much smaller than $u_2$, we have the expansion
Table 11: Regression analysis of the proficiency on a constructed variable using the model-based method;
New Jersey data.

                                            Correlation
Parameter             0           0.2          0.4          0.6          0.8          1.0

Intercept         2251.4.1    22'.!.:S7      229.29      2211.21    ;>■>') }:>,     220.01
Slope              1,1 2(s     l. i:so        1.132        l.Ril        1 Mix        1.111
                  (0.252)      (0.212)      (0.232)      (0.221)      (0 20S)      (o.ioi)
Res. variance      lO.'KMO   llV.Sii. 12   I():i0,l-1    10:iG,l()                 lo:Hi,"jO

Notes: The constructed variable is described in Section 6.4. The results are given for imputed values of the
correlation 0, 0.2, ..., 1.




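$$\hat\beta = (I - u_2^{-1}\varepsilon_2 + u_2^{-1}\varepsilon_2 u_2^{-1}\varepsilon_2 - \ldots)(u_2^{-1}u_1 + u_2^{-1}\varepsilon_1)$$
$$\approx u_2^{-1}u_1 + u_2^{-1}\varepsilon_1 - u_2^{-1}\varepsilon_2 u_2^{-1}u_1 - u_2^{-1}\varepsilon_2 u_2^{-1}\varepsilon_1,$$

and so the mean and variance matrix of $\hat\beta$ are approximated as

$$E(\hat\beta) \approx u_2^{-1}u_1 - u_2^{-1}\left\{\mathrm{tr}(u_2^{-1}\Sigma_{12,1}), \mathrm{tr}(u_2^{-1}\Sigma_{12,2}), \ldots, \mathrm{tr}(u_2^{-1}\Sigma_{12,p})\right\}^\top,$$

$$\mathrm{var}(\hat\beta) \approx u_2^{-1}(\Sigma_1 - A - A^\top + B)\,u_2^{-1},$$

where $A$ is the $p \times p$ matrix with columns $\Sigma_{12,h}\,u_2^{-1}u_1$, and $B$ is the $p \times p$ matrix with elements
$\mathrm{tr}(u_2^{-1}u_1u_1^\top u_2^{-1}\Sigma_{2,hh'})$, $1 \le h, h' \le p$.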

For multilevel regression a number of data summaries, various within-cluster sums of squares and
crossproducts, are required. An algorithm for maximizing the log-likelihood for the population can be
used, with the population quantities replaced by their sample counterparts. The equations for such an
algorithm involve complex functions of the summaries, and, as a consequence, the delta method leads to
unwieldy equations, in particular for estimation of variances and covariances. Nevertheless, the outlined
approach can easily be applied without bias correction and derivation of the sampling variance matrix of
the estimators. The impact of the unknown covariance or correlation matrix can be explored by imputing
several extreme values of the matrix, such as the matrix of zeros or other singular matrices. The same
approach can, in principle, be applied to structural model equations and factor analysis.

7 Two-stage clustered sampling design 

This section derives the equations for the method of Potthoff et al. (1992) for the two-stage (three-
level) clustered sampling design. It arises, for instance, when the replicate groups in the 1990 Math
Trial State Assessment data are associated with variation. The NAEP Assessment for USA employs a
stratified probability clustered sample with two stages of clustering. This section extends the model-based
methods to such sampling designs. We present first the details for the two-stage clustered design with no
stratification. Incorporating stratification is relatively straightforward because it merely corresponds to
collating independent information across the strata.

We use the term 'group' for the same aggregate sampling units (replicate groups) as above, even though
these units are now associated with random variation. We consider the model

$$y_{ijk} = \mu + \alpha_k + \delta_{jk} + \varepsilon_{ijk}, \tag{52}$$

where $\{\alpha_k\}_k$, $\{\delta_{jk}\}_{jk}$, and $\{\varepsilon_{ijk}\}_{ijk}$ are $2 + N_2$ mutually independent random samples from centered
distributions with respective variances $\gamma^2$, $\tau^2$, and $\{\sigma^2_{jk}\}_{jk}$. These variances are referred to as variance
components. The covariance of two observations in the same cluster is $\tau^2 + \gamma^2$ and the covariance of two
observations from the same group but different clusters is $\gamma^2$. Since these covariances can, in principle, be
negative, it is meaningful, though counterintuitive, to consider negative 'variance' components, so long as
the variance matrices for the sample and the target population are non-negative definite.

This model differs from that for the stratified clustered sampling only by the assumptions for the group
deviations $\alpha_k$. In Section 3.4 these deviations are assumed to be unknown constants (fixed), whereas
here we assume them to be a set of i.i.d. random variables. As before, we do not assume specific
distributions for the random variables in (52). We focus on the ratio estimator of the population mean for
E, '■' it ./■ ■ analog, in*, hi ,i a i ii in. the population weighted sample mean is expressed as
where «v,* = H"Ut/Ej »f = E* "r.A - for /f = l - 2 - aII(J lr = E* ir <- •

Further, we denote $\mu_k = \mu + \alpha_k$, and $\mu_{jk} = \mu + \alpha_k + \delta_{jk}$, so that $\hat\mu_{jk}$ and $\hat\mu_k$ are estimators of the
realized values of $\mu_{jk}$ and $\mu_k$, respectively. These estimators are conditionally unbiased given $\mu_{jk}$ and $\mu_k$,
respectively.

The moment method of estimating the variance components $\gamma^2$, $\tau^2$, and $\{\sigma^2_{jk}\}_{jk}$ is based on the following
ANOVA-like (weighted) sums of squares:


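$$v_{A,jk} = \frac{\sum_i u_{A,ijk}(y_{ijk} - \hat\mu_{jk})^2}{n_{A,jk} - 1}, \qquad v_{B,k} = \sum_j u_{jk}(\hat\mu_{jk} - \hat\mu_k)^2, \qquad v_T = \sum_k u_k(\hat\mu_k - \hat\mu)^2 . \tag{53}$$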

The sets of weights $\{u_{jk}\}$ and $\{u_k\}$ can be arbitrary non-negative numbers; their choice is discussed below.
First, we evaluate the expectations of these statistics as (linear) functions of the variance components; then
linearly combine them and solve the resulting moment equations by setting these statistics equal to their
expectations.

For derivation of the expectations of the sum-of-squares statistics in (53) the following identities are
useful:
where $n_k$ is the school-sample size in group $k$. Note that the 'degrees of freedom' appear in this
equation. The identities
For example, the coefficient of $\tau^2$ in (59) is equal to




When the group-level totals of weights are constant then (59) further simplifies to

e«v.) = (A--i,L> + f Y.^Zr^-'h) • 

where $K$ is the number of groups.

7.1 Estimators of the variance components 

The within-school variances are estimated by $v_{A,jk}$. If schools share within-school variances, the statistics
$v_{A,jk}$ can be combined to form an estimator of the common variance. For instance, if all schools within a
group have a common within-school variance $\sigma^2_k$ then

is an unbiased estimator of $\sigma^2_k$.
In general, we have 



k i 

k J 

where the coefficients are as in (55) and (58), or their special cases. Moment
estimators for the school- and group-level variances are the solutions of the pair of linear equations



k k k j 

IT = /V-' 2 + ^Wr, t -r 2 + ^^/; r . J ,.7;', , (00) 

k k j 

where $\{s_k\}$ are a set of non-negative constants;



E* s * .* - T.k ** E; a ik 

T~ — — — 

Et -n-W/r.i- 

- Et Ot\* r a - Ei Ej Di:jt a } f 
= — . . t 6D 

The weights $u_{T,k}$ appear to be natural choices for the coefficients $s_k$. The estimates of the variance
components are then substituted for the true values in (54) to obtain an estimate of the sampling variance $\mathrm{var}(\hat\mu)$
of the population mean estimator $\hat\mu$. Using different sets of weights for the moment matching equations,
estimating some of the variance components by smoothing techniques invoking multiple regression opens
up a variety of possibilities which require further research.

7.2 Stratified two-stage clustered design 

For purposes of statistical analysis the design of the NAEP surveys is described as a stratified two-stage
weighted sampling design. Thus, within each group a small number, usually two, primary sampling units
(PSU) is 'selected' (in reality, a different stratification is used, but after selecting the primary sampling
units they are paired into replicate pairs of PSUs). Then clusters (schools, or 'consolidated' schools) are
selected from each selected PSU, and finally students are sampled from each selected school.

Since within each group we have a two-stage clustered design, the estimators of the variance components
and of the sampling variance of the mean carry over to stratified design, with suitable weights for combining
the within-stratum estimates of the variance components.

8 Summary 

The report describes a class of model-based methods for estimation of the population mean in stratified
clustered sampling. Importance of the adjustment of weights is assessed by an approach considering the
sampling variation of the adjusted weights and its (variance) components. The methods are non-iterative,
and the resulting estimators are more efficient than the jackknife estimators for a variety of datasets
obtained from the 1990 Math Trial State Assessment. The methods can be extended to two-stage clustering.
A general method for estimation of more complex population summaries, such as regression coefficients, is
outlined. It is based on the estimators of the population means, applied to various quadratic functions of
the explanatory and outcome variables. There are no distributional assumptions in model-based methods,
apart from normality of the sample means. Model-based methods use only the final adjusted weights; the
replicate weights can be disposed of, thus radically reducing the size of the dataset and simplifying data
handling procedures. The principal advantage of the model-based methods is in efficiency and small bias
of the estimators of standard errors for the population mean. Contrary to theoretical claims, the (NAEP)
operationally implemented jackknife estimator of the sampling variance is not unbiased.
9 Appendix. Data analysis with Splus



This section describes and documents functions and programs written in Splus for processing and analysis
of 1990 Math Trial State Assessment data.

9.1 Data input 

The data were obtained from the User Tapes, and in the process of transferring them to the Sun Workstation
on which the analysis would take place only a subset of 120 variables was selected. They included all
the resampling weights, plausible values for the composite proficiency scores, information about student
background, and a subset of responses to cognitive and attitudinal items. The file containing this dataset
is named NJ.dat for New Jersey, and similarly for other states.

The data are input into the Splus environment by the following Splus expressions:

Ncols _ 120 

NJdat _ matrix(scan("NJ.dat"),ncol=Ncols,byrow=T) 



The function scan 'scans', or reads, the dataset and temporarily stores it in a vector; by default, two
numbers are separated by one or several spaces, one or several carriage returns, or their combination. This
default can easily be overruled. The function matrix with the argument byrow=T 'shapes' this vector into
a matrix with Ncols columns by filling up its elements row by row. There are reasonable defaults if the
vector has a length which is not a multiple of Ncols, and the user is informed about it by a warning.
The number of students in the data is ascertained as the number of rows of the matrix NJdat:



Nstud _ dim(NJdat)[1]

Next, we sort the students by the schools and by strata. The scalars Njack and Npair are the indices
(column numbers) for the stratum and the school within the stratum. Knowing that there are at most
three schools within a stratum, the students can be sorted on the variable

3 x stratum number + school number within stratum.

First, to ease the burden of typing complex expressions we define an Splus function cls:

cls _ function(c1,c2)
NJdat[,c1] %*% matrix(c2)



This function has two arguments: c1 should be a vector or a list of column indices of NJdat, and c2 a vector
of the same length as c1. The function returns the linear combination of the columns c1 of NJdat with
coefficients c2. The vector c2 has to be reshaped into a matrix because Splus distinguishes between vectors
and matrices with a single column or row. The value of the function cls is a vector even though it is
generated by a matrix operation.
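
As an illustration of how such a linear combination serves as a sort key, the following Python sketch mimics cls and the permutation returned by the Splus function sort.list. The function names linear_combination and sort_list, and the toy records, are assumptions made for this example only.

```python
def linear_combination(rows, cols, coefs):
    """Mimic cls: combine the given columns (0-based) of each row with coefs."""
    return [sum(row[c] * k for c, k in zip(cols, coefs)) for row in rows]


def sort_list(values):
    """Return the permutation (0-based) that sorts values in ascending order."""
    return sorted(range(len(values)), key=lambda i: values[i])


# hypothetical records: (school within stratum, stratum)
records = [(2, 1), (1, 2), (1, 1), (3, 1)]
# key = 1 x school number + 3 x stratum number
keys = linear_combination(records, cols=(0, 1), coefs=(1, 3))
order = sort_list(keys)
sorted_records = [records[i] for i in order]
```

With at most three schools per stratum the key is unique for each school, so sorting on it groups the records of each school together.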

Sorting of the data is accomplished by the following expressions: 

# Npair and Njack are the column indices
# for the replicate and jackknife

Npair _ 12
Njack _ 11

# sort records by schools

NJdat _ NJdat[sort.list(cls(c(Npair,Njack),c(1,3))),]

The hash '#' causes the rest of the line to be regarded as a comment. The function sort.list returns the
permutation that would sort its argument in ascending order. This permutation is then applied to the rows
of the matrix NJdat. The selection of columns of NJdat can be effected by an expression between the last
comma above and the closing bracket ']'. No expression behind the comma is interpreted as 'all columns'.
For convenience, it is useful to calculate the delimiters of the schools in the data. We index the schools
by integers 1, 2, ..., where school 1 is represented by records 1, 2, ..., n_1, school 2 by n_1 + 1, ..., n_1 + n_2,
and so on. We refer to n_j as the school sample sizes.

Bot _ seq(Nstud)[!duplicated(cls(c(Npair,Njack),c(1,3)))]
Top _ c(Bot[-1]-1,Nstud)
Cnt _ Top - Bot + 1
Nclu _ length(Cnt)

In this sequence of expressions Bot is assigned all the indices (components of the vector 1, ..., Nstud),
for which any value of the linear combination of replication and jackknife indices occurs for the first time.
The exclamation mark '!' stands for negation, and the function duplicated returns a logical vector (vector of
T's and F's) indicating whether the value of the component of the argument is equal to that of a previous
component. Thus, Bot contains the indices of the first students from each school. Top is set to the indices
of the last students from each school, Cnt to the number of students from each school, and Nclu to the
number of schools in the dataset (length returns the number of components).
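
The delimiter logic can be sketched in a few lines of Python. This is an illustration under stated assumptions, not part of the original programs: the function name cluster_delimiters and the toy keys are hypothetical, and the keys are assumed to be sorted already, as the records are after the sorting step.

```python
def cluster_delimiters(keys):
    """Return 1-based first indices, last indices and sizes of runs of equal keys.

    The keys must already be sorted, so that each school forms one run.
    """
    n = len(keys)
    seen = set()
    bot = []
    for idx, key in enumerate(keys, start=1):
        # mirrors !duplicated(...): keep the first occurrence of each key
        if key not in seen:
            seen.add(key)
            bot.append(idx)
    # mirrors c(Bot[-1]-1, Nstud)
    top = [b - 1 for b in bot[1:]] + [n]
    cnt = [t - b + 1 for b, t in zip(bot, top)]
    return bot, top, cnt


# hypothetical keys for seven students in three schools
bot, top, cnt = cluster_delimiters([4, 4, 4, 5, 5, 7, 7])
nclu = len(cnt)
```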

9.2 A data summary 

The following expressions give the selected variable names and tabulate the categorical variables and
compute quantiles and standard deviations for the quantitative variables. Note how the function paste is
used for generating a set of similar names (character strings).




# Tabulating the NJdat data

# Tabulation

Vnames _ c("Sex","School","Race","IEP","LEP",
"D.Sex","D.race","Urb.Stratum","Min.Stratum","Inc.Stratum",
"Rep.Grp.1","Drop.Grp","JK.Fac","Weight",
paste("SRWT",seq(1,56)),
"Orig.WT","Num.Cor","Par.Ed.","Single.P","Sch.Math",
"Perc.Math","T.Certf","T.Und.Mj","T.Grp.Mj","T.Math.Crs",
"T.Emph.No.","T.Emph.PS","S.Policy","Problems","B003501A",
"B003601A","B000901A","B000903A","B000904A","B000905A",
paste("M",10100+seq(1,8),"B"),
"M810201B",paste("M",10300+seq(1,3),"B"),
paste("MPRCMP",seq(1,5)),
paste("T023",c(201,301,302,311,312,307,308,313,401,402,
411,412,407)))

length(Vnames)

quanti _ c(1,2,3,seq(13,72),75,seq(103,107))
categ _ seq(1,120)[-quanti]

NJTAB _ list()
for (i in categ)
NJTAB[[i]] _ TBL(i)

for (i in quanti)
NJTAB[[i]] _ c(mean(NJdat[,i]),quantile(NJdat[,i],
c(0,.1,.25,.5,.75,.9,1)),sqrt(var(NJdat[,i])))

"Done"

The content of the list NJTAB is, naturally, quite extensive, and therefore we reproduce only a small
section of it, giving the summary for a categorical and a quantitative variable.



$"Sch.Math , No. 75":
Mean Minimum 10% 25% Median 75% 90% Maximum St. Dev.
3371.3 -25021 -14404 -269 4920 11410 14975.5 19804 10445.4



$"Perc.Math , No. 76": 
1 2 3 9 
725 1392 555 38 



Variable No. 75 is the School level math mean logit score (multiplied by 10000), No. 189 on the User
Tapes, and No. 76 is Student's perception of mathematics, No. 190 on the User Tapes.

9.3 Jackknife 

In this section we present an Splus function for jackknife estimation of subpopulation means. Specifically,
we consider estimation of the mean proficiency for a subpopulation given by a condition described in terms
of the variables in the dataset. The vector of values of this (logical) variable on the sampled students is
the argument of the Splus function Jackf. The default argument is T, that is, the entire population.

The function starts with extracting the dataset for the subsample corresponding to the subpopulation
(NJd), and the sample size of this subsample (nStu). The permanent assignment symbol '<<-' has the
effect of its left-hand side being written in the directory available at the entry into Splus. Other objects
created within the function are temporary: with the exception of the last expression of the function they are
not available after the function is successfully evaluated. If an error occurs during evaluation, the directory
remains intact.

The objects bot, top and cnt are the analogues of the vectors Bot, Top and Cnt for the subsample. The
object Twt (totals of the weights) is a vector of length Njr (number of strata + 1), and its components
are the total of adjusted weights (the first component), and the totals of the jackknife replicate weights for
each pseudoanalysis. Cwt is the index of the adjusted weights, and the adjusted weights are followed by
the jackknife replicate weights in each record of the dataset.
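
To make the role of the weight totals concrete, the following Python sketch computes a weighted mean under each of several sets of weights: the first set plays the role of the adjusted weights and the others the role of jackknife replicate weights. The function name weighted_means and all numbers are hypothetical.

```python
def weighted_means(values, weight_columns):
    """Weighted mean of values under each set of weights."""
    means = []
    for weights in weight_columns:
        total = sum(weights)  # analogue of one component of Twt
        means.append(sum(v * w for v, w in zip(values, weights)) / total)
    return means


# hypothetical scores for three students and two sets of weights;
# in the second set the first student is dropped and another up-weighted
scores = [270.0, 280.0, 290.0]
means = weighted_means(scores, [[1.0, 1.0, 2.0], [0.0, 2.0, 2.0]])
```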

The permanent assignment of Twt, as well as of NJd, is essential because these objects are used in
another function: the function jkmean evaluates the sample mean or the jackknife pseudo-means for a
set of plausible values. The indices for the analysis (sample or jackknife) and for the plausible value are
encoded in the argument j. The function jkmean is used via the apply function to create the vector of
these means. The function apply has three arguments: an array A, an integer i, and a function f. The
function f is applied to each subplane of the i-th dimension of A. In our case, A is a column vector of
integers 0, 1, ..., Njr x Npr - 1 (Npr is the number of plausible values, equal to 5). For a given integer
the associated pseudo-analysis and the plausible value are given by the remainder (%%) and integer
division (%/%), respectively; see the declaration of the function jkmean. The values returned by the apply
function using jkmean are reshaped into an Njr x Npr matrix: it contains the sample and pseudo-means
(rows) for the sets of plausible values (columns).
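
The encoding of the two indices in a single integer can be checked with a short Python sketch. It is an illustration only: the function name decode is hypothetical, and the constants 57 and 5 are the values of Njr and of the number of plausible values quoted in the program comments.

```python
NJR = 57  # sample analysis plus the jackknife pseudo-analyses
NPR = 5   # number of plausible values


def decode(j, njr=NJR):
    """Split the combined index j into (weight offset, plausible-value offset)."""
    # the remainder selects the set of weights, integer division the plausible value
    return j % njr, j // njr


# every (weights, plausible value) combination is visited exactly once
pairs = [decode(j) for j in range(NJR * NPR)]
```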

The function ssq collects the results of the Npr jackknife analyses: the sample (weighted) means, the
jackknife estimates of the means, and the jackknife estimates of the sampling variances. Since the decimal
placing was ignored at the input, division by 100 and 10 000 brings the results onto the appropriate scale.
The function is then applied to each column of sbar, that is, for each plausible value, thus generating a
3 x Npr matrix.
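
A minimal Python sketch of the computation performed by ssq is given below. It assumes, as in the listing, that the first element of the input is the full-sample mean and the remaining elements are the pseudo-means, all on the unscaled (times 100) metric; the input values are hypothetical.

```python
def ssq(yb):
    """Return (sample mean, jackknife mean, jackknife variance) on the reporting scale."""
    full = yb[0]
    pseudo = yb[1:]
    jk_mean = sum(pseudo) / len(pseudo)
    # jackknife variance: squared deviations of the pseudo-means from the full-sample mean
    jk_var = sum((p - full) ** 2 for p in pseudo)
    return full / 100, jk_mean / 100, jk_var / 10000


# hypothetical full-sample mean followed by three pseudo-means
result = ssq([28000.0, 28010.0, 27990.0, 28020.0])
```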

The matrix svars is augmented by the means across the plausible values and its third row (sampling
variance) is augmented by the observed variance of the sampling variances. Finally, the variances are
transformed to standard errors (deviations), labels are attached to the rows and columns of svars, and the
function returns a list containing:

• numbers of students and schools;
• counts of students within schools;
• the matrix svars;
• user, system and elapsed times for evaluation of the function (note that the value of start is assigned
by the first expression of the function).
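
The combination across plausible values can be sketched as follows. This is a hedged illustration rather than a transcription of the program: it assumes the usual rule in which the average sampling variance is increased by (1 + 1/M) times the variance of the M estimates (the factor 1.2 in the listing for M = 5). The function name combine and the numbers are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def combine(estimates, sampling_variances):
    """Combine per-plausible-value results into one estimate and standard error."""
    m = len(estimates)
    overall = mean(estimates)
    # average sampling variance plus the inflated between-plausible-value variance
    total_var = mean(sampling_variances) + (1 + 1 / m) * variance(estimates)
    return overall, sqrt(total_var)


# hypothetical estimates and sampling variances for five plausible values
est, se = combine([279.8, 279.6, 279.5, 279.9, 279.7],
                  [2.8, 2.9, 3.0, 2.7, 3.0])
```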

The expressions constituting the function Jackf are enclosed in braces { }; for functions containing a
single expression, such as ssq, the braces are redundant.

The following is an example of using the function Jackf. The jackknife analysis is performed for the
subpopulation of all students who (would have) responded 'A' ('strongly agree') to the question about
student's perception of mathematics (variable No. 190 in the user tape).

JKre762 _ Jackf(NJdat[,76]==1)
JKre762 

The first expression is an assignment; the second, quoting the name of the object, displays the object
given below:

$Students.clusters:
[1] 725 104

$counts:
[101] 4 8 3 9



$estimates : 

P.vl. 1 P.vl. 2 P.vl. 3 P.vl. 4 P.vl. 5 Ov-11 

Weight, mean 279.837 279.592 279.516 279.910 279.732 279.717 

JK. mean 279.828 279.579 279.501 279.905 279.718 279.706 

JK. st. err. 1.681 1.707 1.722 1.659 1.727 1.707 

Thus, the subsample contains students from 104 schools (each school is represented in the subsample).
The mean proficiency is 279.71; the difference between the jackknife and ratio (weighted) estimates
is trivial. The estimated standard error of the jackknife estimator is 1.71.



# Jackknife analysis for subsets 

# cases selected by seval 

# 1990 New Jersey state trial assessment 

# The function requires only the dataset NJdat 

# and the logical vector 

# for the selected subset, e.g. Jackk(NJdat [, 74]==2 ) 

# ( ! ! ) or the vector of subscripts 

# Constants to be set 

# Cpr . . . the first column of plausible values 

# Cpr = 103 

# lp ... number of Plausible Values 

# lp = 5 , and Ip=seq(l,lp) 

# Cwt . . . the column of weights (followed by the JK weights)

# Cwt = 14

# Njr . . . number of PSUs

# Njr = 57 

# The default is the entire dataset (2710 students) 



Jackf _ function(seval = T)
{
start _ proc.time()

# select the students

NJd <<- NJdat[seval, ]
nStu <- dim(NJd)[1]

# the delimiters for the clusters

bot <- seq(1, nStu)[!duplicated(NJd[, c(Npair, Njack)] %*%
matrix(c(1, 3)))]
top <- c(bot[-1] - 1, nStu)
cnt <- top - bot + 1

# the total weights

Twt <<- apply(NJd[, Cwt:(Cwt + Njr - 1)], 2, sum)

# analysis for each replicate and plausible value

jkmean <- function(j)
{
jj <- j %% (Njr)
sum(NJd[, Cpr + j %/% Njr] * NJd[, Cwt + jj])/Twt[jj + 1]
}

# Jackknife means

sbar <- matrix(apply(matrix(seq(0, Npr * Njr - 1)), 1, jkmean),
Njr, Npr)

# jackknife results (means and variances) for each pl. value

ssq <- function(yb)
c(yb[1]/100, mean(yb[-1])/100, sum((yb[-1] - yb[1])^2)/10000)
svars <- apply(sbar, 2, ssq)

# variance standard error

svars <- cbind(svars, apply(svars, 1, mean))
svars[3, lp + 1] <- svars[3, lp + 1] + 1.2 * var(svars[2, ])
svars[3, ] <- sqrt(svars[3, ])
dimnames(svars) <- list(c("Weighted mean", "JK. mean",
"JK. st. err."), c(paste("P.vl.", Ip), "Ov-11"))

# results

list(Students.clusters = c(nStu, length(cnt)), counts = cnt,
estimates = svars, proc.time = proc.time() - start)
}

9.4 Model based estimation 

The Splus function VCmg listed below executes the set of model-based estimation procedures described in
Section 3.1.

# Function for model based estimation of NAEP data subsets
# Filename NS.md (started 6/8/92) method based on
# Potthoff et al. JASA, 1992
# cases selected by seval
# 1990 New Jersey state trial assessment
# The function requires only the dataset NJdat
# and the logical vector
# for the selected subset, e.g. Jackk(NJdat[,74]==2)
# (!!) or the vector of subscripts
# Use function TBL to tabulate a NAEP variable

# Constants to be set
# Cpr . . . the first column of plausible values
# Cpr = 103
# lp ... number of Plausible Values
# lp = 5, and Ip=seq(1,lp)
# Cwt . . . the column of weights
# Cwt = 14
# Njr . . . number of PSUs
# Njr = 57
# The default is the entire dataset (2710 students)
# STR . . . column of the stratifying variable, default
# STR = 11
# other option . . . STR = 9 (4 strata)

VCmg <- function(seval = T, STR = 11)
{
start <- proc.time()

# select the students

y <- NJdat[seval, c(Npair, Njack, STR, Cwt, Cpr+Ip-1)]
str <- y[,3]
nStu <- dim(y)[1]

# the delimiters for the clusters

bot <- seq(1, nStu)[!duplicated(y[, c(1,2)] %*%
matrix(c(1, 3)))]
top <- c(bot[-1] - 1, nStu)
cnt <- top - bot + 1
Ncl <- length(cnt)
clu <- rep(seq(1,Ncl), cnt)
w <- y[,4]/1000
y <- y[,4+Ip]/100

# stratifying variable (student- and cluster-level)
# recode to strata 1,2,...,nstr

cstr <- unique(str)
str <- match(str, cstr)
Str <- str[top]
nstr <- length(cstr)

# the delimiters for the strata

Sbot <- seq(1,Ncl)[!duplicated(Str)]
Stop <- c(Sbot[-1]-1, Ncl)
Scnt <- Stop-Sbot+1

# total weight
W <- sum(w)

# sample means

Wmn <- t(y)%*%matrix(w)/W

# within-cluster means

tcls <- cbind(tapply(w,clu,sum), tapply(w^2,clu,sum))
for (i in Ip)
tcls <- cbind(tcls, tapply(y[,i]*w, clu, sum)/tcls[,1])

# normalized weights and effective sample sizes A
# estimate of the within-cluster variance

nA <- tcls[,1]^2/tcls[,2]
S2w <<- matrix(0,Ncl,lp)
for (i in Ip)
S2w[,i] <<- tapply((y[,i]-tcls[clu,2+i])^2*w, clu, sum)*
tcls[,1]/tcls[,2] / (nA-1)

S2W <- apply(S2w[nA>1,],2,mean)
S2w[nA==1,] <<- S2W

# within-stratum totals, only when more than 2 PSUs
Snb <- S2w/matrix(nA,nrow=length(nA),ncol=lp)
Snb <- Snb[Sbot,]+Snb[Stop,]
SnA <- nA[Sbot]+nA[Stop]
Swt <- tcls[Sbot,1]+tcls[Stop,1]

# between-cluster sums of squares

difs _ (tcls[Sbot,-c(1,2)]-tcls[Stop,-c(1,2)])^2
v2I <- t(difs)%*%matrix(Swt)
v2II <- t(difs)%*%matrix(SnA)
v2III <- apply(difs/Snb,2,sum)

# between-variance estimates

BI <- (v2I - t(Snb[Sbot!=Stop,])%*%matrix(
Swt[Sbot!=Stop]))/sum(Swt[Sbot!=Stop])/2
BII <- (v2II - t(Snb[Sbot!=Stop,])%*%
matrix(nA[Stop]+nA[Sbot])[Sbot!=Stop,])/
sum(nA[Sbot!=Stop])/2
BIII <- (v2III - sum(Stop!=Sbot))/apply(1/Snb[Stop!=Sbot,],
2,sum)/2

# reestimate

Swr <- 1/(matrix(2*v2I,nrow=length(Sbot),ncol=lp,byrow=T) +
Snb)
v2R <- apply(difs*Swr, 2, sum)
BR <- (v2R - apply((Swr*Snb)[Sbot!=Stop,],2,sum))/
apply(Swr[Sbot!=Stop,],2,sum)/2

# within stratum totals

stra <- rep(seq(1,length(Scnt)),Scnt)
ucls <- cbind(tapply(tcls[,1],stra,sum),
tapply(tcls[,1]^2,stra,sum))
for (i in Ip)
ucls <- cbind(ucls, tapply(tcls[,1]*tcls[,2+i],stra,sum)/
ucls[,1])
difs <- (tcls[,2+Ip]-ucls[stra,2+Ip])^2

# estimates based on W_jk and nA
Excl _ -Sbot[Stop==Sbot]
Wcom <- tcls[,1] - tcls[,1]^2/ucls[stra,1]
vbI <- t(difs) %*% matrix(tcls[,1])
CI <- (vbI - t(S2w)%*%matrix(Wcom/nA))/sum(Wcom)
Wcom <- 1 - 2*tcls[,1]/ucls[stra,1]+ucls[stra,2]/ucls[stra,1]
vbII <- t(difs) %*% matrix(nA)
CII <- (vbII - t(S2w)%*%matrix(Wcom))/sum(Wcom*nA)
vbIII <- t(difs[Excl,])%*%matrix(nA[Excl]/Wcom[Excl])
CIII <- (vbIII - apply(S2w[Excl,],2,sum))/sum(nA[Excl])

# reestimate

Snb _ 1/(matrix(vbI,nrow=length(nA),ncol=lp,byrow=T) *
S2w/matrix(nA,nrow=length(nA),ncol=lp))
vbR <- t((difs*Snb)[Excl,])%*%matrix(1/Wcom[Excl])
CR <- (vbR - apply((S2w/matrix(nA,nrow=length(nA),ncol=lp)*
Snb)[Excl,],2,sum))/apply(Snb[Excl,],2,sum)

# variance of the weighted mean
nB <- W^2/sum(tcls[,1]^2)
varI <- 1/nB*(BI + t(S2w)%*%matrix(tcls[,2])/
sum(tcls[,1]^2))
varII <- 1/nB*(BII + t(S2w)%*%matrix(tcls[,2])/
sum(tcls[,1]^2))
varIII <- 1/nB*(BIII + t(S2w)%*%matrix(tcls[,2])/
sum(tcls[,1]^2))
varR <- 1/nB*(BR + t(S2w)%*%matrix(tcls[,2])/
sum(tcls[,1]^2))
varCI <- 1/nB*(CI + t(S2w)%*%matrix(tcls[,2])/
sum(tcls[,1]^2))
varCII <- 1/nB*(CII + t(S2w)%*%matrix(tcls[,2])/
sum(tcls[,1]^2))
varCIII <- 1/nB*(CIII + t(S2w)%*%matrix(tcls[,2])/
sum(tcls[,1]^2))
varCR <- 1/nB*(CR + t(S2w)%*%matrix(tcls[,2])/
sum(tcls[,1]^2))

# results

indx <- c(2,3,4,5,6,7,8,9)
resm <- matrix(c(Wmn, varI, varII, varIII, varR, varCI,
varCII, varCIII, varCR, S2W, BI, BII, BIII, BR,
CI, CII, CIII, CR), ncol=lp, byrow=T)
resm <- cbind(resm, apply(resm,1,mean))
resm[indx,lp+1] <- resm[indx,lp+1] + (1+1/lp)*
apply(resm[indx,Ip],1,var)
resm[indx,] <- sqrt(resm[indx,])
dimnames(resm) <- list(
c("Weighted mean", "SE I", "SE II", "SE III",
"SE R", "SE CI", "SE CII", "SE CIII", "SE CR",
"Within variance", "BV I", "BV II", "BV III",
"BV R", "BV CI", "BV CII", "BV CIII", "BV CR"),
c(paste("Pl.val.", Ip), "Overall"))

list(Students.schools.strata = c(nStu, Ncl, nstr),
school.sizes = cnt, strata=cstr, PSUs.in.strata=Scnt,
Eff.size = nB, estimates = resm,
proc.time=proc.time()-start)
}



The arguments of the function are the subsample (given by a logical variable, with the entire sample as
the default), and the stratification variable (default variable No. 11, variable No. [illegible] on the User Tapes).
Only the minimal set of variables is selected into the array y. Then the decimal places for the weights and
plausible values are adjusted. The object Ip is the vector (1,2,3,4,5), defined as seq(lp), where lp is the
number of plausible values, equal to 5.

The delimiters for the clusters are set as in Jackf but, additionally, similar delimiters are set for the
strata (with respect to clusters). Some care is necessary because clusters and even whole strata may be
absent in the subsample.

The array tcls contains the within-cluster totals of weights, squared weights, and weighted means of
plausible values; nA are the within-cluster effective sample sizes, and S2w is the matrix of estimates of
the within-cluster variances. The notation for the sum of squares and variance estimator is similar to
that in the text. The method based on pairs of PSUs uses the objects SnA, Swt, v2I, v2II, v2III, and so on.

The other two methods require within-stratum totals of weights and means of plausible values. These
are collected in the matrix ucls. The objects vbI, vbII, vbIII, and so on, are the counterparts of v2I, v2II,
v2III for the two methods. The sets of five (lp) estimates are then summarized in the last column of the
matrix resm. The output of the function is a list containing the clustering structure of the analyzed
subsample, the estimates, and information about processing time.
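
The effective sample size and the within-cluster variance estimate used in the listing can be illustrated for a single cluster. The Python sketch below is an illustration under stated assumptions: the function names and the toy data are hypothetical, and the formulas are read off the expressions for nA and S2w in the listing.

```python
def effective_size(weights):
    """(sum of weights)^2 / (sum of squared weights), as nA in the listing."""
    return sum(weights) ** 2 / sum(w * w for w in weights)


def within_variance(values, weights):
    """Weighted within-cluster variance with the nA/(nA - 1) correction."""
    w1 = sum(weights)
    n_eff = effective_size(weights)
    ybar = sum(w * y for w, y in zip(weights, values)) / w1
    ss = sum(w * (y - ybar) ** 2 for w, y in zip(weights, values))
    return ss / w1 * n_eff / (n_eff - 1)


# equal weights give the number of students; unequal weights give less
n_equal = effective_size([2.0, 2.0, 2.0, 2.0])
n_unequal = effective_size([1.0, 1.0, 1.0, 5.0])
s2 = within_variance([1.0, 2.0, 3.0, 4.0], [2.0, 2.0, 2.0, 2.0])
```

With equal weights the estimate reduces to the ordinary sample variance, which gives a simple check of the formula.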

For illustration we give the analysis for response 'A' to the item Student's perception of mathematics:

$Students.schools.strata:
[1] 725 104 53

$school.sizes:
[1] 8 4 4 10 6 12 7 10 5 3 9 7 13 4 6 6 11 6 1 14
[21] 9 8 11 5 3 14 12 4 2 6 8 12 11 12 4 6 5 5 6 10
[41] 6 7 2 7 13 7 13 5 9 3 5 4 5 4 2 3 10 11 8 3
[61] 11 3 6 6 5 4 14 6 5 7 2 8 6 6 10 11 6 8 7 10
[81] 7 11 7 12 8 5 5 2 2 4 4 5 8 15 9 6 5 5 9 5
[101] 4 8 3 9

$strata: 

6 51 102 149 196 244 295 359 410 476 492 541 600 652 677 732 792 

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 

813 868 926 968 989 1045 1101 1147 1195 1273 1301 1347 1404 1445

18 19 20 21 22 23 24 25 26 28 29 30 31 32 

1498 1551 1598 1645 1725 1768 1809 1859 1915 1963 2013 2067 2125 

33 34 35 36 37 38 39 40 41 42 43 44 45 

2182 2253 2283 2337 2388 2475 2523 2587 2637 

46 47 48 49 50 51 52 53 54 



$PSUs.in.strata:
[1] 2 2 2 2 2 2 2 2 2 1 2 2 2 1 2 2 1 2 2 2 1 2 2 2 2 3 2 2 2 2
[31] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 3

$Eff.size:
[1] 74.536



$estimates:
               P.vl. 1  P.vl. 2  P.vl. 3  P.vl. 4  P.vl. 5    Ov-11
Weight, mean   279.837  279.592  279.516  279.910  279.732  279.717
Wtd. st. err.    1.537    1.572    1.525    1.440    1.536    1.531
Between-var.   102.215  112.575  105.488   81.197  104.311  101.157
W.st. err. II    1.497    1.553    1.515    1.428    1.520    1.511
Btwn-var. II    93.294  108.237  103.124   78.695  100.657   96.801
W.st. err. R     1.716    1.694    1.639    1.530    1.665    1.670
Btwn-var. R    145.696  142.324  132.325  101.063  135.083  131.298
Within var.    643.821  614.549  613.602  679.647  623.104  634.944



$proc.time:

[1] 36.416 4.517 48.000 0.000 0.000 



Note that the effective sample size is 74.5, much smaller than the number of clusters, 104. The first two
estimates of the standard error are nearly identical (1.53 and 1.51), but differ appreciably from the third
one (1.67) which is close to its jackknife counterpart (1.71).

For the entire sample there is a much closer agreement. The jackknife estimate of the standard error is
1.05, while its model-based counterparts are 1.09, 1.10, and 1.085.

9.5 Regression 

In this section the Splus functions for fitting regression by the jackknife and model-based methods are
given. The vector rsel identifies the students with complete records (for each variable used in constructing
the explanatory variable), and xvar is the constructed explanatory variable. The function JKreg returns
the results of a pseudoanalysis. These results are stored in sreg and are returned in a suitable format in
the list JKfit.
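
Each pseudoanalysis amounts to a weighted least-squares fit of one plausible value on the explanatory variable. The following Python sketch shows such a fit computed from weighted moments; it is an illustration only, the function name wls_line is hypothetical, and the data are constructed so that the fit is exact.

```python
def wls_line(x, y, w):
    """Weighted least-squares line: returns (intercept, slope, residual variance)."""
    wt = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / wt
    my = sum(wi * yi for wi, yi in zip(w, y)) / wt
    mxx = sum(wi * xi * xi for wi, xi in zip(w, x)) / wt
    mxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y)) / wt
    slope = (mxy - mx * my) / (mxx - mx ** 2)
    intercept = my - slope * mx
    # weighted mean of squared residuals, as in the last element returned by JKreg
    resvar = sum(wi * (yi - intercept - slope * xi) ** 2
                 for wi, xi, yi in zip(w, x, y)) / wt
    return intercept, slope, resvar


# hypothetical data lying exactly on the line y = 1 + 2x
fit = wls_line([1.0, 2.0, 3.0, 4.0], [3.0, 5.0, 7.0, 9.0], [1.0, 2.0, 1.0, 2.0])
```

Repeating the fit with each set of replicate weights yields the pseudo-estimates from which the jackknife standard errors are formed.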

# An example with regression

# Select all the subjects

rsel _ (NJdat[,91]<6)&(NJdat[,92]<6)&(NJdat[,93]<6)&
(NJdat[,94]<6)&(NJdat[,95]<6)&(NJdat[,96]<6)&
(NJdat[,97]<6)&(NJdat[,98]<6)

xvar _ NJdat[,91]+NJdat[,92]+NJdat[,93]+NJdat[,94]+
NJdat[,95]+NJdat[,96]+NJdat[,97]+NJdat[,98]

JKreg _ function(j)
{
jw _ j%%Njr
jy _ j%/%Njr

# regression of pl. value jy with weights jw

ft _ lsfit(xv,NJd[,Njr+3+jy],NJd[,3+jw])$coef
matrix(c(ft,sum(NJd[,3+jw]*(NJd[,Njr+3+jy]-ft[1]-ft[2]*xv)^2)/
sum(NJd[,3+jw])))
}

ssq _ function(yb)
c(yb[1],mean(yb[-1]),sum((yb[-1]-yb[1])^2))

start _ proc.time()
xv _ xvar[rsel]
NJd _ NJdat[rsel,c(Npair,Njack,Cwt+seq(Njr)-1,Cpr+Ip-1)]
NJd[,seq(Njr)+2] _ NJd[,seq(Njr)+2]/1000
NJd[,Njr+2+Ip] _ NJd[,Njr+2+Ip]/100

bot _ seq(1,dim(NJd)[1])[!duplicated(NJd[,c(1,2)]%*%
matrix(c(1,3)))]
top _ c(bot[-1]-1,dim(NJd)[1])
cnt _ top-bot+1
clu _ rep(seq(1,length(cnt)),cnt)
sreg _ apply(matrix(seq(Npr*Njr)-1),1,JKreg)

sregi _ matrix(sreg[1,],Njr,Npr)
sregs _ matrix(sreg[2,],Njr,Npr)
sregv _ matrix(sreg[3,],Njr,Npr)

sresi _ apply(sregi,2,ssq)
sress _ apply(sregs,2,ssq)
sresv _ apply(sregv,2,ssq)

svari _ cbind(sresi,apply(sresi,1,mean))
svars _ cbind(sress,apply(sress,1,mean))
svarv _ cbind(sresv,apply(sresv,1,mean))

svari[3,lp+1] _ svari[3,lp+1] + (lp+1)/lp*var(svari[2,])
svars[3,lp+1] _ svars[3,lp+1] + (lp+1)/lp*var(svars[2,])

svars[3,] _ sqrt(svars[3,])
svari[3,] _ sqrt(svari[3,])
svarv _ svarv[-3,]

dimnames(svari) _ list(c("Intercept","Jackknife intercept",
"JK. st. err."),c(paste("Pl.val.",Ip),"Overall"))
dimnames(svars) _ list(c("Slope","Jackknife slope",
"JK. st. err."),c(paste("Pl.val.",Ip),"Overall"))
dimnames(svarv) _ list(c("Res. var.","Jackknife res. var."),
c(paste("Pl.val.",Ip),"Overall"))

JKfit _ list(Students.clusters=c(dim(NJd)[1],length(cnt)),
counts=cnt,intercept=svari,slope=svars,Res.variance=svarv,
proc.time=(proc.time()-start)[1:3])

JKfit



The program for the model-based estimator is given below. The nesting structure (given by bot, top,
and cnt for clusters, and sbot, stpp and scnt for strata) is required, but only one set of sampling weights is
used. The notation is similar to that in the other Splus functions and in the text.

# Regression with NAEP State data using Model-based methods

# Filename NW.Reg

# the same data as in NW.reg (jackknife)

# Select all the subjects

rsel _ (NJdat[,91]<6)&(NJdat[,92]<6)&(NJdat[,93]<6)&
(NJdat[,94]<6)&(NJdat[,95]<6)&(NJdat[,96]<6)&
(NJdat[,97]<6)&(NJdat[,98]<6)

xvar _ NJdat[,91]+NJdat[,92]+NJdat[,93]+NJdat[,94]+
NJdat[,95]+NJdat[,96]+NJdat[,97]+NJdat[,98]

# the stratum indicator
STR _ 11

start _ proc.time()
### xv _ xvar[rsel]

NJd _ NJdat[rsel,c(Npair,Njack,STR,Cwt,Cpr+Ip-1)]
NJd[,4] _ NJd[,4]/1000
NJd[,4+Ip] _ NJd[,4+Ip]/100

# clustering

bot _ seq(1,dim(NJd)[1])[!duplicated(NJd[,c(1,2)]%*%
matrix(c(1,3)))]
top _ c(bot[-1]-1,dim(NJd)[1])
cnt _ top-bot+1
Ncl <- length(cnt)
clu <- rep(seq(1,Ncl),cnt)

# stratification
# stratifying variable (student- and cluster-level)

str <- NJd[,3]
cstr <- unique(str)

# recode to strata 1,2,...,nstr

str <- match(str,cstr)
Str <- str[top]
nstr <- length(cstr)

# the delimiters for the strata

sbot <- seq(1,Ncl)[!duplicated(Str)]
stpp <- c(sbot[-1]-1,Ncl)
scnt <- stpp-sbot+1

# the y-variate (first plausible value) and weights
w <- NJd[,4]

# total weight
Wt <- sum(w)

# weighted within-cluster totals for 1, x, y, x^2, xy, y^2

Mfit _ list()
for (i in Ip)
{
yv <- NJd[,4+i]
Tcls _ cbind(tapply(w,clu,sum),tapply(xv*w,clu,sum),
tapply(yv*w,clu,sum),tapply(xv^2*w,clu,sum),
tapply(xv*yv*w,clu,sum),tapply(yv^2*w,clu,sum))

# sample means of x, y, x^2, xy, y^2
WMn <- apply(Tcls,2,sum)/Wt

# regression estimate

nume _ WMn[5] - WMn[2]*WMn[3]
deno _ WMn[4] - WMn[2]^2
beta _ nume/deno
alph _ WMn[3] - WMn[2]
sigm _ WMn[6] - WMn[3]^2 - WMn[5] + WMn[2]*WMn[3]
strr _ sqrt(sigm/deno/length(yv))

## sampling variance estimation

dt _ cbind(xv,yv,xv^2,xv*yv,yv^2) - matrix(WMn[-1],
nrow=length(xv),ncol=5,byrow=T)

## effective sample size (A)

W2s _ tapply(w^2,clu,sum)
nA _ Tcls[,1]^2/W2s
S2w <- array(0,c(Ncl,lp,lp))

# the inner loops use the index k so that the outer index i is preserved
for (k in 1:Ncl)
S2w[k,,] <- t(dt[bot[k]:top[k],])%*%dt[bot[k]:top[k],]/
(nA[k]-1)

## within-stratum totals

stra <- rep(seq(length(scnt)),scnt)
Ucls <- matrix(tapply(Tcls[,1],stra,sum))
for (k in 2:dim(Tcls)[2])
Ucls <- cbind(Ucls,tapply(Tcls[,k],stra,sum)/Ucls[,1])
Tcls[,-1] _ Tcls[,-1]/matrix(Tcls[,1],nrow=dim(Tcls)[1],
ncol=dim(Tcls)[2]-1)
VrEst <- t(Tcls[,-1]-Ucls[stra,-1])%*%((Tcls[,-1]-Ucls[stra,-1])*
matrix(Tcls[,1],nrow=dim(Tcls)[1],ncol=dim(Tcls)[2]-1))

WCom _ Tcls[,1]-Tcls[,1]^2/Ucls[stra,1]
sMM _ S2w[1,,]*WCom[1]/nA[1]
for (k in 2:Ncl)
sMM _ sMM + S2w[k,,]*WCom[k]/nA[k]
VHat2 _ (VrEst - sMM)/sum(WCom)

# variance of the weighted mean

nB _ Wt^2/sum(Tcls[,1]^2)
varm _ S2w[1,,]*W2s[1]
for (k in 2:Ncl)
varm _ varm + S2w[k,,]*W2s[k]
varm _ (VHat2+varm/sum(Tcls[,1]^2))/nB

## varm (5x5) contains the variance matrix for
## (x,y,x^2,xy,y^2)

## sampling variation of the regression parameter estimate
## numerator variance

nuvar _ varm[4,4] + varm[1,1]*varm[2,2] + varm[1,2]^2 +
WMn[2]^2*varm[2,2] + WMn[3]^2*varm[1,1] - 2*WMn[2]*varm[2,4] -
2*WMn[3]*varm[1,4] + 2*WMn[2]*WMn[3]*varm[1,2]

## denominator variance

devar _ varm[3,3] + 2*varm[1,1]^2 + 4*WMn[2]^2*varm[1,1] -
4*WMn[2]*varm[1,3]

## the estimated covariance of the numerator and denominator

covr _ varm[3,4] - 2*WMn[2]*varm[1,4] - WMn[2]*varm[2,3] -
WMn[3]*varm[1,3] + 2*varm[1,1]*(varm[1,2] + WMn[2]*WMn[3]) +
2*WMn[2]^2*varm[1,2]

## expectation and variance assuming COVARIANCE of the numerator
## and denominator equal to cvr, and the estimated covariance

cvr _ c((-0.1+seq(11)/10)*sqrt(nuvar*devar),covr)
ebet _ nume/deno + cvr/deno^2 - devar*nume/deno^3
vbet _ nuvar/deno^2 + devar*nume^2/deno^4 - 2*cvr*nume/deno^3

## residual variance
## numerator

u1 _ nuvar + (WMn[5] - WMn[2]*WMn[3] - varm[1,2])^2

## denominator

u2 _ WMn[4] - varm[1,1] - WMn[2]^2

## variance of numerator is 3*nuvar^2
## variance of denominator is devar

cvr _ c((-0.1+seq(11)/10)*sqrt(3*devar)*nuvar,covr*sqrt(3*nuvar))
rsig _ WMn[6] + varm[2,2] - WMn[3]^2 -
u1/u2*(1 + devar/u2^2) + cvr/u2^2

paste("covariance ",covr)
paste("correlation ",crre)

"Done"

## sampling variance of the numerator and denominator

nuvar ; devar ;

## estimates and standard errors

rout _ rbind(WMn[3]-ebet*WMn[2],ebet,vbet,rsig,covr/
sqrt(nuvar*devar))
dimnames(rout) _ list(c("Intercept","Slope","St.Err.Sl.",
"Res.var.","Covariance"),c(paste("Correlation",
(seq(11)-1),"/10"),"Est.corr."))
Mfit[[i]] _ rout
}

Mfit

## Summarize

MfitS _ matrix(0,5,12)
mns _ matrix(0,lp,12)
for (i in Ip)
{
MfitS _ MfitS + Mfit[[i]]
mns[i,] _ Mfit[[i]][2,]
}

MfitS _ MfitS/lp
MfitS[3,] _ sqrt(MfitS[3,] + apply(mns,2,var)*(1+1/lp))

MfitS